qiskit-community / qiskit-experiments

Qiskit Experiments
https://qiskit-community.github.io/qiskit-experiments/
Apache License 2.0

Experiment serialization #1392

Open coruscating opened 9 months ago

coruscating commented 9 months ago

Suggested feature

A frequently requested feature is the ability to reconstruct an entire experiment using syntax similar to ExperimentData, i.e. BaseExperiment.load(id) and BaseExperiment.save().

PRs

Open questions

ItamarGoldman commented 8 months ago

The intended user workflow is as follows:

  1. Get the experiment running (jobs submitted and all)
  2. Save the experiment (this gives me an experiment id)
  3. Terminate the python script
  4. Start a new python session and load the experiment from the database service using the previously obtained experiment id
  5. Run analysis
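To make the save-then-reload steps concrete, here is a minimal sketch that uses a temporary directory of JSON files as a stand-in for the database service. Everything in it (`save_config`, `load_config`, the config keys, the job ID string) is hypothetical and only illustrates that the experiment ID is the single thing carried between sessions; it is not the qiskit-experiments API.

```python
import json
import tempfile
import uuid
from pathlib import Path

# Hypothetical on-disk "database": one JSON file per experiment ID
DB_DIR = Path(tempfile.mkdtemp())


def save_config(config: dict) -> str:
    """Persist a config and return the experiment ID that identifies it."""
    experiment_id = uuid.uuid4().hex
    (DB_DIR / f"{experiment_id}.json").write_text(json.dumps(config))
    return experiment_id


def load_config(experiment_id: str) -> dict:
    """Read a config back using only the experiment ID."""
    return json.loads((DB_DIR / f"{experiment_id}.json").read_text())


# Session 1: jobs are submitted and the experiment is saved (steps 1 and 2)
exp_id = save_config({"type": "T1", "qubits": [0], "job_ids": ["job-1"]})

# Session 2: after a restart, only exp_id is needed to reload (step 4)
config = load_config(exp_id)
```

Analysis (step 5) would then run against whatever the reloaded configuration and job IDs point to.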

One option for the interface is to load the experiment from the experiment ID:

exp = BaseExperiment.load(exp_id)

Then, to get experiment data that is connected to the experiment object, the user can call:

exp_data = ExperimentData.load(exp)

The reason to separate creating the experiment from loading the experiment data is the case where a user wants to create an experiment with the same configuration and analysis options but doesn't need the data. Example: the user wants to run a T1 experiment daily with the same configuration. In this case the data from the previous experiment isn't relevant, and loading only the experiment saves the user from configuring it twice.

Alternatively, the user can load the experiment data without the experiment object, using the experiment ID:

exp_data = ExperimentData.load(exp_id)

For the expected usage of loading both experiment data and the experiment object, we could make use of a utility function:

exp, exp_data = load_experiment(exp_id)

This requires a place to store the experiment ID in the BaseExperiment class.
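A rough sketch of how such a utility could fit together, using toy classes and an in-memory service. `ToyService`, `ToyExperiment`, `ToyExperimentData`, and the `experiment_id` attribute are all assumptions made for illustration; none of them are existing qiskit-experiments names.

```python
from typing import Dict, List, Optional, Tuple


class ToyService:
    """Hypothetical in-memory stand-in for the experiment database service."""

    def __init__(self) -> None:
        self.configs: Dict[str, dict] = {}
        self.results: Dict[str, list] = {}


class ToyExperiment:
    def __init__(self, delays: List[float], experiment_id: Optional[str] = None):
        self.delays = list(delays)
        # Assumed attribute: where the experiment keeps its database ID
        self.experiment_id = experiment_id

    def save(self, service: ToyService, experiment_id: str) -> str:
        service.configs[experiment_id] = {"delays": self.delays}
        self.experiment_id = experiment_id
        return experiment_id

    @classmethod
    def load(cls, experiment_id: str, service: ToyService) -> "ToyExperiment":
        return cls(experiment_id=experiment_id, **service.configs[experiment_id])


class ToyExperimentData:
    def __init__(self, experiment_id: str, data: list):
        self.experiment_id = experiment_id
        self.data = data

    @classmethod
    def load(cls, experiment_id: str, service: ToyService) -> "ToyExperimentData":
        return cls(experiment_id, service.results.get(experiment_id, []))


def load_experiment(
    experiment_id: str, service: ToyService
) -> Tuple[ToyExperiment, ToyExperimentData]:
    """Load the experiment and its data with one call."""
    exp = ToyExperiment.load(experiment_id, service)
    exp_data = ToyExperimentData.load(experiment_id, service)
    return exp, exp_data


service = ToyService()
ToyExperiment([1e-6]).save(service, "abc123")
service.results["abc123"] = [{"counts": {"0": 90, "1": 10}}]
exp, exp_data = load_experiment("abc123", service)
```

Because both objects carry the same ID, either one can be loaded on its own and still be matched to the other later.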

coruscating commented 8 months ago

Thanks for the interface patterns @ItamarGoldman. I agree it's good to have the option to load either the experiment or experiment data or both, since these operations could be slow for large experiments. Since there's precedent for BaseExperiment methods returning ExperimentData objects, we can also consider something like exp, exp_data = BaseExperiment.load(exp_id, return_exp_data=True) as the pattern to load both objects instead of introducing another utility function.

As for ExperimentData.load(exp), I think this overloading makes the already confusing interface even more confusing, and the experiment object would have to provide not only experiment ID but also the service for this to work. I prefer to keep the ExperimentData.load(exp_id) pattern and improve the current service and provider parameters so that we're not passing in provider=QiskitRuntimeService(). What do you think?

eliarbel commented 8 months ago

As for ExperimentData.load(exp), I think this overloading makes the already confusing interface even more confusing

Good point, I tend to agree with keeping the function taking exp_id only

ItamarGoldman commented 8 months ago

After some tests I have some insights. ExperimentDecoder knows how to import the class from the module, so we can change the methods BaseExperiment.from_config(cls, config) and BaseAnalysis.from_config(cls, config) to initialize the class using:

ret = config.cls(*config.args, **config.kwargs)
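As an illustration of that one-liner, here is a self-contained sketch with a toy config dataclass. `ToyConfig`, `ToyExperiment`, and `ToyT1` are hypothetical stand-ins, not the library's classes; the point is only that `from_config` instantiates whichever class the config recorded, even when it is called on the base class.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Tuple, Type


@dataclass
class ToyConfig:
    """Hypothetical stand-in for a stored experiment config."""

    cls: Type
    args: Tuple[Any, ...] = ()
    kwargs: Dict[str, Any] = field(default_factory=dict)


class ToyExperiment:
    """Hypothetical experiment base class."""

    def __init__(self, physical_qubits, delays=None):
        self.physical_qubits = tuple(physical_qubits)
        self.delays = list(delays or [])

    def config(self) -> ToyConfig:
        # Record the concrete class along with the constructor arguments
        return ToyConfig(
            cls=type(self),
            args=(self.physical_qubits,),
            kwargs={"delays": self.delays},
        )

    @classmethod
    def from_config(cls, config: ToyConfig) -> "ToyExperiment":
        # Instantiate the class stored in the config, not necessarily `cls`
        return config.cls(*config.args, **config.kwargs)


class ToyT1(ToyExperiment):
    pass


original = ToyT1([0], delays=[1e-6, 2e-6])
restored = ToyExperiment.from_config(original.config())
```

Here `restored` is a `ToyT1` even though `from_config` was called on `ToyExperiment`, which is the behavior the proposal relies on.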

In addition, I think we should match the load method of the ExperimentData class, so the method should have the following signature:

@classmethod
def load(
    cls,
    experiment_id: str,
    service: Optional[IBMExperimentService] = None,
    provider: Optional[Provider] = None,
) -> "BaseExperiment":
    # Validity check: a service, or a provider to derive one from, is required
    if service is None:
        if provider is None:
            raise ExperimentDataError(
                "Loading an experiment requires a valid Qiskit provider or experiment service."
            )
        # Reuse the ExperimentData helper; BaseExperiment does not define its own
        service = ExperimentData.get_service_from_provider(provider)
    # Get the experiment config and analysis config from the database
    # (load_experiment_config is a proposed service method)
    experiment_config, analysis_config = service.load_experiment_config(experiment_id)
    # Reconstruct the experiment (here we can support custom experiments)
    reconstructed_experiment = cls.from_config(experiment_config)
    # Reconstruct the analysis (here we can support custom analyses)
    reconstructed_experiment.analysis = reconstructed_experiment.analysis.from_config(analysis_config)
    return reconstructed_experiment

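To load the experiment and the experiment data at the same time, we can extend the method with a return_exp_data flag:

@classmethod
def load(
    cls,
    experiment_id: str,
    service: Optional[IBMExperimentService] = None,
    provider: Optional[Provider] = None,
    return_exp_data: bool = False,
) -> Union["BaseExperiment", Tuple["BaseExperiment", ExperimentData]]:
    # Reconstruct the experiment using the previous implementation.
    # _load_experiment is a placeholder name for that logic; calling
    # cls.load here would recurse forever.
    reconstructed_experiment = cls._load_experiment(experiment_id, service, provider)
    if not return_exp_data:
        return reconstructed_experiment
    # Get the experiment data
    reconstructed_experiment_data = ExperimentData.load(experiment_id, service=service, provider=provider)
    return reconstructed_experiment, reconstructed_experiment_data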

Another thing I thought of is loading an experiment with custom experiment and analysis classes. In this case the user provides the classes and we use them to load experiment_config and analysis_config. This is easily done by having the user pass the classes in.
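One way this could look, as a sketch only: the loader resolves the class by name where it can, and otherwise uses the class the user passes. `load_from_config`, the `experiment_class` parameter, `KNOWN_CLASSES`, and the dict-shaped config are all hypothetical names chosen for the example.

```python
from typing import Any, Dict, Optional, Type


class ToyT1:
    """Hypothetical built-in experiment the loader already knows."""

    def __init__(self, qubit: int, delays=None):
        self.qubit = qubit
        self.delays = list(delays or [])


class MyCustomT1(ToyT1):
    """User-defined subclass that the loader cannot resolve by name."""


# Hypothetical registry of classes the loader can resolve on its own
KNOWN_CLASSES: Dict[str, Type] = {"ToyT1": ToyT1}


def load_from_config(config: Dict[str, Any], experiment_class: Optional[Type] = None):
    """Rebuild an experiment, preferring a user-supplied class if given."""
    target = experiment_class or KNOWN_CLASSES.get(config["cls"])
    if target is None:
        raise ValueError(
            f"Unknown experiment class {config['cls']!r}; pass experiment_class explicitly."
        )
    return target(*config["args"], **config["kwargs"])


stored = {"cls": "MyCustomT1", "args": [0], "kwargs": {"delays": [1e-6]}}
exp = load_from_config(stored, experiment_class=MyCustomT1)
```

Without `experiment_class`, the same call raises a `ValueError`, which keeps the failure explicit rather than silently falling back to a different class.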

What do you think?

wshanks commented 8 months ago

Some points from external discussion:

ItamarGoldman commented 8 months ago

Another idea that was mentioned is that the load function will load the transpiled circuits of the experiment. For this, we will need to take the following into consideration:

wshanks commented 8 months ago

Another idea that was mentioned is that the load function will load the transpiled circuits of the experiment.

This idea is in @coruscating's original list above:

Add ability to run custom transpiled circuits (https://github.com/Qiskit-Extensions/qiskit-experiments/pull/1222 or new PR). This allows QE to run circuits saved in job data. Users have the ability to decide between running the transpiled circuits or generating new circuits.

In the most recent discussion we said this support is not required in the initial implementation. I think this use case should be supported, but I don't think it should be the default behavior. I see it as a more advanced use case because there is more that can go wrong. The old circuits might not be valid after an update to the backend or when trying to run on a different backend or if the experiment class has been updated since the previous run. Also, downloading the circuits could be slower than regenerating them, like you mention.
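To illustrate the opt-in behavior described here, a small sketch with circuits represented as plain strings. The function `select_circuits` and its parameters (`use_stored`, `stored_backend`, `target_backend`) are hypothetical; the backend-name comparison stands in for whatever validity check a real implementation would need.

```python
from typing import Callable, List, Optional, Sequence


def select_circuits(
    generate: Callable[[], List[str]],
    stored: Optional[Sequence[str]] = None,
    stored_backend: Optional[str] = None,
    target_backend: Optional[str] = None,
    use_stored: bool = False,
) -> List[str]:
    """Return freshly generated circuits unless stored ones are requested."""
    if not use_stored:
        # Default: regenerate, which avoids stale or incompatible circuits
        return generate()
    if stored is None:
        raise ValueError("No stored circuits are available for this experiment.")
    if stored_backend != target_backend:
        raise ValueError(
            f"Stored circuits target {stored_backend!r}, not {target_backend!r}."
        )
    return list(stored)


fresh = select_circuits(lambda: ["new_circuit"], stored=["old_circuit"])
reused = select_circuits(
    lambda: ["new_circuit"],
    stored=["old_circuit"],
    stored_backend="backend_a",
    target_backend="backend_a",
    use_stored=True,
)
```

A real check would also have to account for backend updates and changes to the experiment class since the previous run, which a name comparison alone does not catch.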

coruscating commented 7 months ago

@wshanks discussed in the meeting today that we should support the use case of not running the analysis when running an experiment and then saving experiment config in a dummy ExperimentData container.

wshanks commented 7 months ago

In the meeting, I was thinking that experiment.analysis = False was the way to avoid running analysis immediately on an experiment. I forgot about experiment.run(analysis=False). So I am not sure if there is much to do for that last point (#1423 might already support it), but we should make sure that it does.