Open andrewdunsworth opened 8 years ago
Pictorial example:
We want this: pi/2 ----- pi ------ pi/2 measure
But we get this: pi/2 ----- pi pi/2 ------ measure
This was in the SiQ branch under s = ['','SiQ','data', 'Qubit', 'SQSi Pla 16', 'Dry', '151203-Cal3'] to be specific (sq.spin_echo_long).
Looks like a delay and a pi/2 pulse are switching places?
yes.
I don't think this is a problem for packet sniffing, at least not yet. The first thing you should do is look at the jump table commands your code is actually generating to send to the servers. If you rerun your code, do you get blips at the same location? (That is, do the same inputs to your scan function reliably generate bad sequences?)
We ran our scan with the same input multiple times and it only occasionally failed. Also, we noticed that within the same run of the experiment, some of the stats would be bad and some of the stats would be good.
Here's a good run:
which corresponds to:
Play SRAM - IDLE - Play SRAM - IDLE - Play SRAM
and here's a bad run with the exact same parameters
which looks more like:
Play SRAM - Short IDLE - Play SRAM - Play SRAM - ????
Does this point to more of a hardware and firmware issue?
Given that the software will always be outputting the same thing I think this indicates a hardware failure.
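One way to check the claim that the software output is identical between good and bad runs is to fingerprint the bytes before they are sent. A minimal sketch, assuming you can collect the raw packets for each board as a list of `bytes`; the `packets_by_board` dict, the board names, and both helper functions are hypothetical and not part of fpgalib:

```python
import hashlib

def fingerprint(packets_by_board):
    """Return {board: sha256 hex digest} for a dict of {board: [bytes, ...]}."""
    digests = {}
    for board, packets in sorted(packets_by_board.items()):
        h = hashlib.sha256()
        for pkt in packets:
            # Length-prefix each packet so packet boundaries affect the hash.
            h.update(len(pkt).to_bytes(4, "big"))
            h.update(pkt)
        digests[board] = h.hexdigest()
    return digests

def changed_boards(run_a, run_b):
    """List boards whose packet bytes differ between two runs."""
    fa, fb = fingerprint(run_a), fingerprint(run_b)
    return sorted(b for b in set(fa) | set(fb) if fa.get(b) != fb.get(b))
```

If `changed_boards` returns an empty list for a good run and a bad run with the same scan parameters, the difference is being introduced after the software hands off the packets.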
within the same run of the experiment, some of the stats would be bad and some of the stats would be good.
Does this mean that after a single start command, some of the N repetitions output a correct sequence and some an incorrect sequence? That would definitely be an FPGA hardware problem.
Yes it appears so :(
Could also be SRAM writes happening in the middle of the sequence.
On Fri, Dec 4, 2015 at 4:32 PM, zchen088 notifications@github.com wrote:
Yes it appears so :(
Yes, we've definitely had issues with SRAM writes happening while the sequence is running, at least when debugging. I don't know that it has been thoroughly tested in the wild, e.g. with long scans and stuff.
To look at this, you can plug two of the scope lines into the mon1 and mon2 connectors on the back of the board, and then play around with what gets written to the monitors (i.e. you'd want to look at SRAM start and sequence start, etc.)
Specifically here is the code you need to change to change the monitors (IIRC, things have been shifted a bit recently):
https://github.com/martinisgroup/servers/blob/master/fpgalib/dac.py#L1205
I'll try to get started debugging, but can we get someone with more GHzDAC-fu to help us next week? This seems important.
can we get someone with more GHzDAC-fu to help us next week? This seems important.
Absolutely. I finished the data run on Vince and have nothing major on my todo list. Happy to help. Some things you guys could do:
@andrewdunsworth and/or @zchen088 can you summarize what we learned yesterday here?
When this problem occurs, it occurs on every stat of an experiment. In the case of sq.swap_spectroscopy it seemed to affect only one board (the z board): instead of running step_down, jump table idle, and step_up, it ran step_down and step_up (all that was written in SRAM) repeatedly every 16 us, i.e.:
intended scan: -----|_________|---------
actual scan: -----|_|------------|_|----------|_|----...
The readout, xy, and adc boards did not seem to be affected by this problem.
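To make the two traces concrete, here is a toy timeline model. It is not the GHzDAC firmware or the fpgalib jump table API: the op names, the `timeline` and `low_time` helpers, and the durations are all made up for illustration.

```python
STEP = 100    # ns, arbitrary SRAM block length for this sketch
IDLE = 5000   # ns, arbitrary jump table idle length for this sketch

def timeline(ops):
    """Expand [(name, duration_ns), ...] into ([(start_ns, name), ...], total_ns)."""
    t, events = 0, []
    for name, duration in ops:
        events.append((t, name))
        t += duration
    return events, t

def low_time(ops):
    """Time between the end of step_down and the start of step_up."""
    events, _total = timeline(ops)
    starts = {name: start for start, name in events}
    durations = dict(ops)
    return starts["step_up"] - (starts["step_down"] + durations["step_down"])

intended = [("step_down", STEP), ("idle", IDLE), ("step_up", STEP)]
# Observed failure mode: the idle is skipped, so step_up follows step_down directly.
observed = [("step_down", STEP), ("step_up", STEP)]

print(low_time(intended), low_time(observed))
```

In this model the intended sequence stays low for the full idle time, while the observed one has no low plateau at all, which matches the short `|_|` blips in the sketch above.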
We then learned that the way the DAC boards work is that the daisy chain fires on every stat of an experiment (which wasn't obvious to me).
Ok, so problem occurs on every repetition of a given experiment. Does it occur every time that particular sequence is run?
Ok, so problem occurs on every repetition of a given experiment.
We are not 100% sure of that.
Does it occur every time that particular sequence is run?
No.
No, it happens seemingly at random, about once every 10 to 100 times a given scan is run, and only if jump table commands are called (we are using idle, nop, and end). Every time we saw it occur while monitoring on the scope (after figuring out how to make it trigger on the error), it happened in clumps that corresponded to the number of stats.
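With a failure rate that low, it helps to know how many repeats are needed before a clean result means anything. A rough estimate, assuming each scan fails independently with probability `p_fail` (that independence is an assumption; `runs_needed` is just an illustrative helper):

```python
import math

def runs_needed(p_fail, confidence=0.95):
    """Scans needed to see at least one failure with the given confidence."""
    # P(no failure in n runs) = (1 - p_fail)**n must drop below 1 - confidence.
    return math.ceil(math.log(1 - confidence) / math.log(1 - p_fail))

for p in (0.1, 0.01):
    print(p, runs_needed(p))
```

For a 1-in-10 failure rate this gives 29 scans, and for 1-in-100 it gives 299, so a fix should survive a few hundred scans before it is trusted.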
@andrewdunsworth you can edit your comment posts after they're posted, fyi.
Here are the two sequences as seen on the scope:
Normal sequence:
Messed up sequence:
Green is the z-board, blue is the xy, yellow is the ADC demod, and pink is monitor_1 of the readout DAC board, which is set to 1. According to
https://github.com/martinisgroup/servers/blob/u/maffoo/dac-docs/fpgalib/docs/GHzDAC_v8.md
that is the SRAM write:
It seems like it plays the SRAM (without the jump table delay) 16 times at 16 us intervals and then plays the SRAM with the correct delay after that. I have set up the scope to only trigger once (I think before we were getting false pulse sequences because it was triggering multiple times with the messed up Z pulses).
When we put the adc_1 monitor on the SRAM write it seems like it writes at the same place for both the 'good' pulse sequence and the 'messed up' one:
good:
messed:
We put blue on monitor_0 = 5 which is the SRAM running option. It seems to know that it is running SRAM the entire time as the pulses look like this:
Good: Bad:
@andrewdunsworth, do you have Wireshark packet captures for two runs like these? It would be really helpful to see exactly what got sent to the boards.
Ok so if we set blue to IDLE and pink to END (I'm guessing these are the jump table commands), it looks like: good: bad:
So it seems to think that it is idling for the entire length of the messed up pulse sequence... which doesn't make sense (notice that there is a slight down tick where the actual idle occurs at the end of the messed up sequence).
This really makes it look like it's a jump table problem and not a hardware problem.
We are working on getting wireshark up and running.
We now have wireshark but are unsure how to parse the data.
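As a starting point for parsing, here is a minimal sketch that reads a capture using only the standard library. It assumes the capture was saved in the classic libpcap format (Wireshark defaults to pcapng, so use Save As) with microsecond timestamps and Ethernet link type. It knows nothing about the GHzDAC packet layout; it only groups frame payloads by destination MAC so that a good run and a bad run can be diffed. The function names are illustrative.

```python
import struct
from collections import defaultdict

PCAP_MAGIC = 0xA1B2C3D4  # classic pcap, microsecond timestamps

def read_pcap(data):
    """Yield (timestamp_s, frame_bytes) from the contents of a classic pcap file."""
    if struct.unpack("<I", data[:4])[0] == PCAP_MAGIC:
        endian = "<"
    elif struct.unpack(">I", data[:4])[0] == PCAP_MAGIC:
        endian = ">"
    else:
        raise ValueError("not a classic pcap file (pcapng? re-save as pcap)")
    offset = 24  # skip the 24-byte global header
    while offset + 16 <= len(data):
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack(
            endian + "IIII", data[offset:offset + 16])
        offset += 16
        yield ts_sec + ts_usec * 1e-6, data[offset:offset + incl_len]
        offset += incl_len

def frames_by_destination(data):
    """Group Ethernet payloads by destination MAC address string."""
    groups = defaultdict(list)
    for _ts, frame in read_pcap(data):
        if len(frame) < 14:
            continue  # too short to hold an Ethernet header
        dst = ":".join(f"{b:02x}" for b in frame[:6])
        groups[dst].append(frame[14:])  # strip dst MAC, src MAC, ethertype/length
    return groups
```

Read each capture with `open(path, "rb").read()`, pass the bytes to `frames_by_destination`, and compare the payload lists for the z board's MAC address between the good and bad captures.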
We are now sharking wires.
Roughly once every hundred scans, the jump table command happens before the pulse sequence (or vice versa). I noticed this in my version of spin echo, where the second pi/2 pulse would sometimes jump right next to the pi pulse. I think this also happens during our version of swap spectroscopy, since there are random points at long delay times with a very large population (as if the pi pulse is jumping right next to the readout pulse). Can someone packet sniff this out? @maffoo @DanielSank @zchen088 @JulianSKelly
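To find the suspect points in existing data without eyeballing every scan, something like the following could flag them. This is a sketch only: `flag_jumps` is a hypothetical helper, the window and threshold are arbitrary, and it assumes the populations are a plain Python list ordered by delay time.

```python
import statistics

def flag_jumps(populations, window=2, threshold=0.3):
    """Return indices whose value differs from the median of nearby points by more than threshold."""
    flagged = []
    for i, value in enumerate(populations):
        lo = max(0, i - window)
        hi = min(len(populations), i + window + 1)
        # Neighbours on both sides, excluding the point itself.
        neighbours = populations[lo:i] + populations[i + 1:hi]
        if neighbours and abs(value - statistics.median(neighbours)) > threshold:
            flagged.append(i)
    return flagged
```

Using the median rather than the mean keeps a single bad point from dragging its neighbours over the threshold.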