Qblox results not matching between qibo versions

qiboteam / qibolab

Qblox results not matching between qibo versions #919

Open DavidSarlle opened 5 months ago

DavidSarlle commented 5 months ago

After calibrating the iqm5q chip with qblox, we have found that some routines are not working, or are not returning the expected data. We compared the results obtained with main and with an older branch from December to check the differences. Below are some examples produced with the same action runcard and platform, but with different versions of the qibolab driver and qibocal:

OPT 1:

qubit spec qblox (alvaro/latest):
qibo 0.2.4
qibocal 0.0.7 /nfs/users/david.fuentes/qibocal
qibolab 0.1.5 /nfs/users/david.fuentes/qibolab

OPT 2:

qubit spec qblox (david/iqm5q_latest == main):
qibo 0.2.8
qibocal 0.0.10 /nfs/users/david.fuentes/qibocal
qibolab 0.1.7 /nfs/users/david.fuentes/qibolab
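
To make a comparison like this reproducible, it can help to record the installed versions from inside each environment. The following is a minimal sketch using only the standard library (`importlib.metadata`); the helper name `installed_versions` is made up for illustration, and the package list is simply the three packages named above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("qibo", "qibocal", "qibolab")):
    """Return {package name: version string, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed in the active environment.
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

Since both setups install qibocal and qibolab from local checkouts, the version string alone may not identify the branch, so noting the commit hash alongside it is probably worthwhile.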

qubit spec (all qubits)

OPT 1: http://login.qrccluster.com:9000/sAtV-_GDT2S3qiSeTag8vA== OPT 2: http://login.qrccluster.com:9000/iFEq5TLGSn2sRnj5JHU3JQ==

classify + t2 (q0)

OPT 1: http://login.qrccluster.com:9000/xExLnD9_RlCDZSCC9cUpFA== OPT 2: http://login.qrccluster.com:9000/kQrzGr9sT6C2KQiOk0yTzA==

I am working on comparing more routines, but as you can see, qubit spectroscopy does not give the same result with the same chip, qubits, platform and parameters.json. It seems to me that some changes introduced in the qblox driver since December are producing the differences, and, after checking the results, we are pretty confident that the good ones are those produced with the older version of qibolab.

I will post more comparisons between versions as soon as possible.

alecandido commented 5 months ago

Hi @DavidSarlle, thanks for reporting the issue.

I see that many things change between your two options, including the Qibolab and Qibocal versions, and possibly even the platform specification (including all the instrument configurations). The Qibolab version can also affect the version of the qcodes driver used for Qblox. How have you been able to pin down the discrepancy to just the Qblox driver?

Moreover, the reported results were also acquired on different days, possibly with a different "environment". Despite that, most of the frequencies agree to MHz precision (or even better), the only remarkable difference being in the Qubit 3 spectroscopy. But that difference is so large that it doesn't seem to be related just to a difference in the driver (though I may be wrong in many ways). And the fit for Qubit 4 in the old setup is even wrong...

Could you explain more about your inference, to better understand where the problem could be localized?

DavidSarlle commented 5 months ago

@alecandido the platform specification and the runcard are exactly the same; you can check in the branches used. In both cases I used exactly the same parameters.json and the same platform.py.
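
One way to double-check that two parameter files really are identical is a recursive comparison of their JSON content. The sketch below is only an illustration: `diff_params`, `diff_files` and the `MISSING` marker are hypothetical names, and the file paths passed to `diff_files` would be whatever the two branches use. It only compares the files themselves, so settings hard-coded in platform.py or in the driver would not show up.

```python
import json
from pathlib import Path

MISSING = "<missing>"  # marker for a key that exists on one side only


def diff_params(old, new, path=""):
    """Yield (path, old_value, new_value) for every entry that differs."""
    if isinstance(old, dict) and isinstance(new, dict):
        for key in sorted(set(old) | set(new)):
            child = f"{path}/{key}"
            if key not in old:
                yield child, MISSING, new[key]
            elif key not in new:
                yield child, old[key], MISSING
            else:
                yield from diff_params(old[key], new[key], child)
    elif old != new:
        # Leaves (numbers, strings, lists) are compared as a whole.
        yield path, old, new


def diff_files(old_file, new_file):
    """Load two JSON files and return the list of differences."""
    old = json.loads(Path(old_file).read_text())
    new = json.loads(Path(new_file).read_text())
    return list(diff_params(old, new))
```

An empty list from `diff_files` means the two files carry the same values, regardless of key order or whitespace.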

If the qcodes driver is affecting the data we should debug it, because the data should be the same. Also in the T2 experiment.

I am reporting the problem only for qblox because it is the driver that we are using now with the iqm5 chip (totally characterized and well controlled), and also because I did not see the same differences using Zh (before we disconnected).

Regarding the envs, obviously they should be different, because the test is done with different versions of qibo.

Regarding the days, I can run one right after the other and the results are the same. Here you can see a fresh test executed right now:

opt 1: http://login.qrccluster.com:9000/tiYyLgh-QHyCECK7sVv8tw== opt 2: http://login.qrccluster.com:9000/ycDFRq38QTOl75-tPQCA7Q==

The freqs for q3 do not change that much between executions done minutes apart. I am pretty sure about that. Also, if you test the new freq for q3 in a single-shot experiment you will see that the fitted freq is not the correct one, and the one from the old version produces a much better assignment fidelity. Also, if you check the results, the phase changes between the two cases for all the qubits. And the freqs, if you run the same experiment many times, one after the other, using the same branch and versions of qibo, do not change that much, and this MHz of difference affects the fidelity of the qubit a lot.

Also check the T2 experiment pls. I will try to give you more examples with other routines. I am working on it.

The fit is not important; it fails because of the data.

DavidSarlle commented 5 months ago

@alecandido @hay-k another example, with classify and Rabi length signal:

opt 1: http://login.qrccluster.com:9000/1cB8FENOQdWkFth1S6LxBg== opt 2: http://login.qrccluster.com:9000/R6UIW0GsROm1x71vsleuoA==

And I am pretty sure that the pi pulse is well characterized and, in both cases, we are using the same parameters.json and platform.py values.

Check the Rabis pls

alecandido commented 5 months ago

> If the qcodes driver is affecting the data we should debug it, because the data should be the same. Also in the T2 experiment.

It could, simply because our driver is built on top of it, so changes on their side can have the same impact as changes on our side.

> I am reporting the problem only for qblox because it is the driver that we are using now with the iqm5 chip (totally characterized and well controlled), and also because I did not see the same differences using Zh (before we disconnected).

Ok, that's relevant information. Thanks.

> Regarding the envs, obviously they should be different, because the test is done with different versions of qibo.

What I meant by "environment" was not the software virtual environment, but the experimental setup.

> Regarding the days, I can run one right after the other and the results are the same. Here you can see a fresh test executed right now:
>
> opt 1: http://login.qrccluster.com:9000/tiYyLgh-QHyCECK7sVv8tw== opt 2: http://login.qrccluster.com:9000/ycDFRq38QTOl75-tPQCA7Q==

But being able to reproduce it today makes it much more reliable, thanks for rerunning!

> The freqs for q3 do not change that much between executions done minutes apart. I am pretty sure about that. Also, if you test the new freq for q3 in a single-shot experiment you will see that the fitted freq is not the correct one, and the one from the old version produces a much better assignment fidelity. Also, if you check the results, the phase changes between the two cases for all the qubits. And the freqs, if you run the same experiment many times, one after the other, using the same branch and versions of qibo, do not change that much, and this MHz of difference affects the fidelity of the qubit a lot.
>
> Also check the T2 experiment pls. I will try to give you more examples with other routines. I am working on it.

I would not worry too much about the MHz (or sub-MHz) difference right now: it is a smaller instance of the same phenomenon that leads to the failing fit, and it is a symptom. The problem is clearly at the source, since in the best case (qubits 0-2) the results are much noisier, while in the worst ones (qubits 3 and 4) they have a completely different shape.

Btw, are you aware of any peculiarity of qubit 3 that could make it much more prone to these changes?

However, it would certainly be useful if you could extract the Q1ASM generated by the two experiments. Especially for the spectroscopies it should be pretty simple, and generating the correct Q1ASM is the best thing we can do in the driver (together with setting the correct parameters, which is the only part outside the Q1ASM).

Currently it's not trivial to extract, but not that complicated either:

* you can set the debug folder, and look for a file named as described here: https://github.com/qiboteam/qibolab/blob/f335e4d37376c973ea8372b6c95905245409fe71/src/qibolab/instruments/qblox/cluster_qrm_rf.py#L903-L910
* or just place your own instruction there to save wherever you wish

In case of trouble, I believe @aorgazf knows the most about how to do it (I've done it myself just a few times, and mostly using hacks to avoid a connection; he's much more experienced in actual use). But you may also be more experienced than me :) Once you have extracted all the Q1ASM for the instruments (just for the qubit spectroscopy), could you put them in a folder, and upload the zip here?
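
Once the Q1ASM dumps from both versions sit in two folders, a plain text diff is probably the quickest way to spot where the generated programs diverge. The sketch below uses only the standard library (`difflib`, `pathlib`); the function name `diff_dumps` and the folder layout are assumptions, and it matches files purely by name, so it only works if both versions write their dumps under the same file names.

```python
import difflib
from pathlib import Path


def diff_dumps(old_dir, new_dir, pattern="*"):
    """Return {file name: unified diff lines} for every file that differs.

    A file present in only one folder is diffed against an empty file.
    """
    old_dir, new_dir = Path(old_dir), Path(new_dir)
    names = {p.name for p in old_dir.glob(pattern) if p.is_file()}
    names |= {p.name for p in new_dir.glob(pattern) if p.is_file()}
    report = {}
    for name in sorted(names):
        old_path, new_path = old_dir / name, new_dir / name
        old = old_path.read_text().splitlines() if old_path.is_file() else []
        new = new_path.read_text().splitlines() if new_path.is_file() else []
        delta = list(
            difflib.unified_diff(old, new, f"old/{name}", f"new/{name}", lineterm="")
        )
        if delta:
            report[name] = delta
    return report
```

Printing `"\n".join(lines)` for each entry of the returned dictionary gives a diff that can be pasted directly into the issue.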

This information is very valuable as a diagnostic.

Thanks again for the report and the additional details!

DavidSarlle commented 5 months ago

@hay-k @alecandido another example, with Ramsey:

opt 1: http://login.qrccluster.com:9000/nh0qZh1RQ_6EJbUmzqUUOQ== opt 2: http://login.qrccluster.com:9000/M5nF89I9ROCyrh_JmdCeJg==

DavidSarlle commented 5 months ago

> > If the qcodes driver is affecting the data we should debug it, because the data should be the same. Also in the T2 experiment.
>
> It could, simply because our driver is built on top of it, so changes on their side can have the same impact as changes on our side.
>
> > I am reporting the problem only for qblox because it is the driver that we are using now with the iqm5 chip (totally characterized and well controlled), and also because I did not see the same differences using Zh (before we disconnected).
>
> Ok, that's relevant information. Thanks.
>
> > Regarding the envs, obviously they should be different, because the test is done with different versions of qibo.
>
> What I meant by "environment" was not the software virtual environment, but the experimental setup.

The experimental setup is not changing that much between days or executions.

> > Regarding the days, I can run one right after the other and the results are the same. Here you can see a fresh test executed right now: opt 1: http://login.qrccluster.com:9000/tiYyLgh-QHyCECK7sVv8tw== opt 2: http://login.qrccluster.com:9000/ycDFRq38QTOl75-tPQCA7Q==
>
> But being able to reproduce it today makes it much more reliable, thanks for rerunning!
>
> > The freqs for q3 do not change that much between executions done minutes apart. I am pretty sure about that. Also, if you test the new freq for q3 in a single-shot experiment you will see that the fitted freq is not the correct one, and the one from the old version produces a much better assignment fidelity. Also, if you check the results, the phase changes between the two cases for all the qubits. And the freqs, if you run the same experiment many times, one after the other, using the same branch and versions of qibo, do not change that much, and this MHz of difference affects the fidelity of the qubit a lot. Also check the T2 experiment pls. I will try to give you more examples with other routines. I am working on it.
>
> I would not worry too much about the MHz (or sub-MHz) difference right now: it is a smaller instance of the same phenomenon that leads to the failing fit, and it is a symptom. The problem is clearly at the source, since in the best case (qubits 0-2) the results are much noisier, while in the worst ones (qubits 3 and 4) they have a completely different shape.
>
> Btw, are you aware of any peculiarity of qubit 3 that could make it much more prone to these changes?

q3 has a problem in the special cable/connector that we use to connect the chip to the fridge lines (this is a hypothesis). We have observed that much more power has to be applied to get a 40 ns pi pulse. It was exactly the same with ZI. But, again, with the two versions of qibo and the qblox driver, the results are totally different, as you can see. Also take the phase into account: it is not plotted in the same way. And please note that, even with the known problem, the results are consistent from execution to execution in both versions.

> However, it would certainly be useful if you could extract the Q1ASM generated by the two experiments. Especially for the spectroscopies it should be pretty simple, and generating the correct Q1ASM is the best thing we can do in the driver (together with setting the correct parameters, which is the only part outside the Q1ASM).
>
> Currently it's not trivial to extract, but not that complicated either:
>
> * you can set the debug folder, and look for a file named as described here:
>   https://github.com/qiboteam/qibolab/blob/f335e4d37376c973ea8372b6c95905245409fe71/src/qibolab/instruments/qblox/cluster_qrm_rf.py#L903-L910
> * or just place your own instruction there to save wherever you wish
>
> In case of trouble, I believe @aorgazf knows the most about how to do it (I've done it myself just a few times, and mostly using hacks to avoid a connection; he's much more experienced in actual use). But you may also be more experienced than me :) Once you have extracted all the Q1ASM for the instruments (just for the qubit spectroscopy), could you put them in a folder, and upload the zip here?

I do not know how to do it; maybe @hay-k can do it, so that we can debug faster?

> This information is very valuable as a diagnostic.
>
> Thanks again for the report and the additional details!

DavidSarlle commented 5 months ago

@alecandido please, check the latest examples (rabi and ramsey) and let me know if you need more... But you can try yourself using the branch reported and main.

It seems to me that we have a problem in the RO, maybe in how the start of the RO pulse is managed and set in the pulse sequence. But it is only an intuition after reviewing and running many, many characterization results on the iqm5q chip with ZH and qblox.

We have to continue working on characterization, which is why I am reporting the issue and not checking in depth, but I can confirm that we have been able to fully characterize CZs with qblox and the old version of the code, and not with main.

Please let me know if we can help with something else.

alecandido commented 5 months ago

> The experimental setup is not changing that much between days or executions.
>
> And please note that, even with the known problem, the results are consistent from execution to execution in both versions.

Indeed, you're definitely right. In general, I believe the results are not always that stable (also see the option 2 behavior of qubit 4 in the previous run and in today's), but given the precision with which you can reproduce the two results, it is clear that it does not depend on the experimental setup. Thanks again for rerunning!

> @alecandido please, check the latest examples (rabi and ramsey) and let me know if you need more... But you can try yourself using the branch reported and main.

I already saw them before my previous message. I agree with you about the quality of these results, but I'd focus on the spectroscopies for now, since they are informative enough and the simplest. It is clear that just observing the qubit spectroscopy of qubit 3 there are remarkable differences (qubit 4 too, but one is enough) that are not easily explained by anything other than execution. We could check the definition of the spectroscopy pulse sequence in the two versions of Qibocal (and I will do it!), but I'm pretty sure that nothing changed.

> It seems to me that we have a problem in the RO, maybe in how the start of the RO pulse is managed and set in the pulse sequence. But it is only an intuition after reviewing and running many, many characterization results on the iqm5q chip with ZH and qblox.

Yes, I'm not sure whether that's the problem, but I would place my bet there as well.

> Please let me know if we can help with something else.

Maybe running a resonator spectroscopy on qubit 3 (or on all qubits, fwiw), to collect more info about the role of the drive pulse. However, not even this is strictly required.

> We have to continue working on characterization, which is why I am reporting the issue and not checking in depth, but I can confirm that we have been able to fully characterize CZs with qblox and the old version of the code, and not with main.

I see. Unfortunately, I cannot shift much of my own effort to this issue right now, because of other commitments as well. However, I'm planning to rewrite the Qblox driver for 0.2 (cf. #868, and I will follow up from there), and that's definitely valuable information. Having a correct baseline implementation will be necessary to move to the new one, and if it's possible to have that in main, it would be much better (but I may not be able to do it immediately).

DavidSarlle commented 5 months ago

@alecandido thanks for the effort. The only thing is that, by fixing these problems (if there really are any, which I think is the case), we will be able to run the latest version of qibocal, which gives us better execution times and more functionality when characterizing. Otherwise, if the results are not correct, we cannot use the latest version of qibolab and qibocal with qblox.

DavidSarlle commented 5 months ago

More examples of discrepancies between results @alecandido @hay-k :

standard RB:

OPT 1: http://login.qrccluster.com:9000/8YF3IcsmSaemvZoaKhqWYQ== OPT 2: http://login.qrccluster.com:9000/V176eCBEShqotHSFnHFwYQ==

DavidSarlle commented 4 months ago

@alecandido did you have time to take a look into this problem? It is quite important that we check it, in order to open the use of the chip to anybody from TII or any external collaborator using the qibolab/qibocal main repos.

hay-k commented 4 months ago

@DavidSarlle I started taking a look at this today. Will let you know once I have any updates or need clarifications.

DavidSarlle commented 4 months ago

Thanks a lot for the effort and taking care of the issue @hay-k . Let me know if you need help with something.

hay-k commented 3 months ago

By now I have closely examined all given examples, and I will summarize the findings and potential further steps below.

First of all, none of the examples show non-matching results between official qibo/qibolab/qibocal versions. If one uses older official versions (from Dec 2023) they will produce matching results with the current main.

What the examples show is non-matching results between official qibocal/qibolab (whether current main or from Dec 2023) and custom versions of qibocal/qibolab that were branched out some time in Dec 2023. Here is a detailed explanation for each example:

1. Qubit spectroscopy. Results for most of the qubits are roughly the same, but for qubit 3 they are considerably different. This is because of a difference in the instrument parameters used. In particular, the execution with the custom branches uses mixer calibration for the instrument qrm_rf1, while the execution with the main branches does not. If the same mixer calibration is used with the main branch, the result of the experiment is identical to the results coming from the custom branches. The same goes for an old official release from Dec 2023.
2. T2 and Ramsey experiments. With the main branches (or in official releases from Dec 2023) these do not work because the readout pulse start is not swept in tandem with the drive pulse start sweep.

All differences arise because the custom branches implement new functionality that was never transferred to the official releases:

1. The custom branch for qibolab exposes mixer calibration parameters, while the official releases do not, so there is no way to set them.
2. The custom branch for qibocal pads the readout pulse and sweeps its start in parallel with the drive pulse.
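
To illustrate the point about the readout start, here is a toy timing calculation in plain Python. It does not use the qibolab or qibocal API; the function name `ramsey_schedule`, the 40 ns pulse duration and the delay values are illustrative assumptions. It compares a readout that follows the second drive pulse with one whose start stays fixed at the value computed for the first delay.

```python
def ramsey_schedule(delays, pi_half_duration=40, readout_follows_drive=True):
    """Return (delay, second_pulse_start, readout_start) in ns for each delay.

    Sequence: pi/2 pulse at t=0, wait `delay`, second pi/2 pulse, readout.
    """
    # Readout start that is correct only for the first point of the sweep.
    fixed_readout_start = 2 * pi_half_duration + delays[0]
    schedule = []
    for delay in delays:
        second_start = pi_half_duration + delay
        second_end = second_start + pi_half_duration
        if readout_follows_drive:
            readout_start = second_end  # swept in tandem with the drive pulse
        else:
            readout_start = fixed_readout_start  # left behind by the sweep
        schedule.append((delay, second_start, readout_start))
    return schedule


if __name__ == "__main__":
    for row in ramsey_schedule([0, 100, 200], readout_follows_drive=False):
        print(row)
```

With the fixed readout start, every point after the first one has the readout beginning before the second drive pulse has even started, so the measured signal no longer corresponds to the intended delay.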

Options to proceed:

1. For this there is already a reported issue, https://github.com/qiboteam/qibolab/issues/900, and there are plans to properly incorporate exposing mixer parameters in a unified (across various instruments) interface in the upcoming qibolab major release 0.2. If this is not needed urgently, we can wait for that. If it is needed urgently, we can cherry-pick the corresponding changes from the custom qibolab branch and patch the current 0.1.* versions, to mitigate the risks related to waiting for the 0.2 release.
2. I am not sure if there is an issue for this, but it is well known, and again the 0.2 major release introduces a better interface for constructing pulse sequences, where users (qibocal) can describe this type of sweep without tricks like padding. Perhaps we can adopt some patching strategy here as well; however, this is out of the scope of qibolab: it is a patch for qibocal, so I will let the qibocal developers decide (@andrea-pasquale @Edoardo-Pedicillo ?).

andrea-pasquale commented 3 months ago

Thanks @hay-k for the summary.

> 2. I am not sure if there is an issue for this, but it is well known, and again the 0.2 major release introduces a better interface for constructing pulse sequences, where users (qibocal) can describe this type of sweep without tricks like padding. Perhaps we can adopt some patching strategy here as well; however, this is out of the scope of qibolab: it is a patch for qibocal, so I will let the qibocal developers decide (@andrea-pasquale @Edoardo-Pedicillo ?).

Regarding the T2 and Ramsey experiments, I'm aware that in some branches there are custom patches to align qblox with the other instruments. That being said, it is still possible to run T2 using t2_sequences, which performs a for loop. If unrolling is supported by Qblox, that is also an option for both t2 and ramsey.

Regarding merging custom patches, we need to assess whether these changes are strictly necessary in the short term. If not, I would suggest waiting for 0.2, given that qibocal will also change significantly to adapt to the new qibolab layout.

DavidSarlle commented 3 months ago

Thanks @hay-k @andrea-pasquale for checking. Regarding the problems reported: to me, the objective is to be able to use main as soon as possible with qblox and iqm5q. The chip is totally characterized and we want to benchmark 2q gates; we also need to expose the chip to all the QRC groups that may want to use it.

With that in mind, in my opinion we should apply all the fixes without waiting for the new qibolab or qibocal release. The issues were opened a long time ago (May 15th and June 20th) and we have been using the old branch until now, migrating/fixing routines and the qblox driver in order to characterize the chip.

Also, many useful new functionalities and improvements in execution times and fits have been introduced in main qibocal/qibolab, and we want to use them instead of continuously migrating them to the old branch or making them compatible with it.

Specific comments regarding the problems found:

* If mixer calibration was the problem, we should implement it as soon as possible; the issue was opened by @aorgazf on May 15: https://github.com/qiboteam/qibolab/issues/900
* T2 was just one example, and even if we can run it using t2_sequences, there may be more routines affected by the same problem. As the final objective is to be able to characterize, control, benchmark and use 2q gates, we should test that all the routines are working as expected and that this issue is not affecting them. Again, for that we need to fix the issues found.
* Regarding the new functionalities that were never transferred: the issues were also opened a long time ago, but nobody implemented them on main (https://github.com/qiboteam/qibolab/issues/921, https://github.com/qiboteam/qibolab/issues/920 and https://github.com/qiboteam/qibolab/issues/900). As we also detected problems with the results, we decided to fix them. So the implementation (a temporary patch) in main should not be complicated, and it is strictly needed for running the latest characterization.

I cannot say how to proceed, but we need to have something in main that works for the iqm5q as soon as possible and that can run 2q gates as the older branches do.

hay-k commented 3 months ago

@DavidSarlle thanks for the summary. I know about the existence of all those issues. I am not responsible for priorities or for setting expectations on when each issue is going to be resolved, but if you have the feeling that this type of issue has not been getting enough priority recently, I can at least say that a lot of priority in the team is put on finishing the development of qibolab version 0.2. And that major release is not orthogonal to these issues: the constant appearance of this type of need is one of the reasons 0.2 was planned in the first place, so that new features can be added to qibolab in a consistent manner, instead of as scattered ad-hoc code everywhere. As I suggested, to mitigate delays (both foreseen and unforeseen), and to enable you to use the latest features of qibolab/qibocal, some of the ad-hoc features from the custom branch can be ported to main. I will be working on this.

DavidSarlle commented 3 months ago

@hay-k thanks. I really appreciate the effort of porting the necessary features of the custom branch into main. We need them to continue fine-tuning and benchmarking the 2q gates.