Jump Table Commands in Wrong Order

andrewdunsworth commented 8 years ago

About once every hundred scans, the jump table command happens before the pulse sequence (or vice versa). I noticed this in my version of spin echo, where the second pi/2 pulse would sometimes jump right next to the pi pulse. I think it also happens in our version of swap spectroscopy, since there are random points at long delay times with a very large population (as if the pi pulse were jumping right next to the readout pulse). Can someone packet sniff this out? @maffoo @DanielSank @zchen088 @JulianSKelly

zchen088 commented 8 years ago

Pictorial example:

We want this:    pi/2 ----- pi ------ pi/2 measure
But we get this: pi/2 ----- pi pi/2 ------ measure

andrewdunsworth commented 8 years ago

This was in the SiQ branch under s = ['','SiQ','data', 'Qubit', 'SQSi Pla 16', 'Dry', '151203-Cal3'] to be specific (sq.spin_echo_long).

DanielSank commented 8 years ago

Looks like a delay and a pi/2 pulse are switching places?

andrewdunsworth commented 8 years ago

yes.

maffoo commented 8 years ago

I don't think this is a problem for packet sniffing, at least not yet. The first thing you should do is look at the jump table commands your code is actually generating to send to the servers. If you rerun your code, do you get blips at the same location? (That is, do the same inputs to your scan function reliably generate bad sequences?)
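One way to answer that without touching the hardware is to build the command list several times from the same inputs and compare fingerprints. This is only a rough sketch: `build_commands` is a hypothetical stand-in for whatever function in your scan code produces the jump table commands, and `example_builder` is made up for illustration. Neither is part of fpgalib.

```python
import hashlib
import json


def fingerprint(commands):
    """Return a stable SHA-256 hex digest of a JSON-serializable command list."""
    encoded = json.dumps(commands, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def is_deterministic(build_commands, params, repeats=20):
    """Call the builder repeatedly with identical inputs and compare the outputs."""
    digests = {fingerprint(build_commands(params)) for _ in range(repeats)}
    return len(digests) == 1


def example_builder(params):
    """Hypothetical builder: play SRAM, idle, play SRAM, end."""
    return [
        ["play_sram", 0],
        ["idle", params["delay_ns"]],
        ["play_sram", 1],
        ["end"],
    ]


if __name__ == "__main__":
    print(is_deterministic(example_builder, {"delay_ns": 1000}))
```

If the digests differ between calls, the bug is in the software that generates the sequence. If they never differ, the next place to look is what actually goes out on the wire.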

zchen088 commented 8 years ago

We ran our scan with the same input multiple times and it only occasionally failed. Also, we noticed that within the same run of the experiment, some of the stats would be bad and some of the stats would be good.

Here's a good run: screenshot_2015-12-04-15-59-18

which corresponds to: Play SRAM - IDLE - Play SRAM - IDLE - Play SRAM

and here's a bad run with the exact same parameters screenshot_2015-12-04-15-59-41

which looks more like: Play SRAM - Short IDLE - Play SRAM - Play SRAM - ????

Does this point to more of a hardware and firmware issue?

JulianSKelly commented 8 years ago

Given that the software will always be outputting the same thing I think this indicates a hardware failure.

maffoo commented 8 years ago

> within the same run of the experiment, some of the stats would be bad and some of the stats would be good.

Does this mean that after a single start command, some of the N repetitions output a correct sequence and some an incorrect sequence? That would definitely be an fpga hardware problem.

zchen088 commented 8 years ago

Yes it appears so :(

ejeffrey commented 8 years ago

Could also be SRAM writes happening in the middle of the sequence.

pomalley commented 8 years ago

Yes, we've definitely had issues with SRAM writes happening while the sequence is running, at least when debugging. I don't know that it has been thoroughly tested in the wild, e.g. with long scans and stuff.

To look at this, you can plug two of the scope lines into the mon1 and mon2 connectors on the back of the board, and then play around with what gets written to the monitors (i.e. you'd want to look at SRAM start and sequence start, etc.)

pomalley commented 8 years ago

Specifically, here is the code you need to edit to change the monitors (IIRC, things have been shifted a bit recently):

https://github.com/martinisgroup/servers/blob/master/fpgalib/dac.py#L1205

zchen088 commented 8 years ago

I'll try to get started debugging, but can we get someone with more GHzDAC-fu to help us next week? This seems important.

DanielSank commented 8 years ago

> can we get someone with more GHzDAC-fu to help us next week? This seems important.

Absolutely. I finished the data run on Vince and have nothing major on my todo list. Happy to help. Some things you guys could do:

Peter's suggestion about looking at the monitors.
Install Wireshark and @maffoo's plug-in for dissecting our FPGA packets.
Modify fpgaseqtransmon to log the packets it's preparing for the qubit sequencer and make sure they're right. I think there may already be a flag you can set in the module so that it writes packet data to a file.
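If no such flag turns up, the logging itself is small. Here is a minimal sketch, assuming you can get at each packet as a `bytes` object right before it is sent. The function names, the log format, and the board names are all made up for illustration and are not what fpgaseqtransmon does.

```python
import binascii
import time


def log_packet(log_path, board_name, payload):
    """Append one packet to a text log: timestamp, board, length, hex payload."""
    line = "{:.6f}\t{}\t{}\t{}\n".format(
        time.time(),
        board_name,
        len(payload),
        binascii.hexlify(payload).decode("ascii"),
    )
    with open(log_path, "a") as f:
        f.write(line)


def read_log(log_path):
    """Read the log back as a list of (board, payload bytes), dropping timestamps."""
    entries = []
    with open(log_path) as f:
        for line in f:
            _, board, _, hex_payload = line.rstrip("\n").split("\t")
            entries.append((board, binascii.unhexlify(hex_payload)))
    return entries
```

Logging a good run and a bad run to separate files and comparing the `read_log` results would show directly whether the software sent different bytes in the two cases.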

DanielSank commented 8 years ago

@andrewdunsworth and/or @zchen088 can you summarize what we learned yesterday here?

andrewdunsworth commented 8 years ago

When this problem occurs, it occurs on every stat of an experiment. In the case of sq.swap_spectroscopy it seemed to affect only one board (the z board): instead of running step_down, jump table idle, and step_up, it ran step_down and step_up (all that was written in SRAM) repeatedly every 16 us, i.e.:

intended scan: -----|_________|---------
actual scan:   -----|_|------------|_|----------|_|----...
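To make the two pictures above comparable in numbers, here is a toy model of both timelines. It is only a sketch: the function names and the durations in the usage note are invented, and the only figure taken from our observations is the 16 us repeat period.

```python
def expected_timeline(step_down_ns, idle_ns, step_up_ns):
    """Return (label, start_ns, end_ns) segments for step_down, idle, step_up."""
    segments = []
    t = 0
    for label, duration in (
        ("step_down", step_down_ns),
        ("idle", idle_ns),
        ("step_up", step_up_ns),
    ):
        segments.append((label, t, t + duration))
        t += duration
    return segments


def faulty_timeline(step_down_ns, step_up_ns, period_ns, repeats):
    """Model the observed failure: SRAM replayed every period_ns, idle skipped."""
    segments = []
    for i in range(repeats):
        t = i * period_ns
        segments.append(("step_down", t, t + step_down_ns))
        segments.append(("step_up", t + step_down_ns, t + step_down_ns + step_up_ns))
    return segments
```

For example, `expected_timeline(100, 5000, 100)` puts step_up at 5100 ns, while `faulty_timeline(100, 100, 16000, 2)` puts it at 100 ns and again at 16100 ns.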

The readout, xy, and adc boards did not seem to be affected by this problem.

We then learned that the way the DAC boards work is that the daisy chain fires on every stat of an experiment (which wasn't obvious to me).

maffoo commented 8 years ago

Ok, so problem occurs on every repetition of a given experiment. Does it occur every time that particular sequence is run?

DanielSank commented 8 years ago

> Ok, so problem occurs on every repetition of a given experiment.

We are not 100% sure of that.

> Does it occur every time that particular sequence is run?

No.

andrewdunsworth commented 8 years ago

No, it happens seemingly at random, about once every 10 to 100 runs of a given scan, and only if jump table commands are called (we are using idle, nop, and end). Every time we saw it occur while monitoring on the scope (after figuring out how to make it trigger on the error), it happened in clumps that corresponded to the number of stats.

DanielSank commented 8 years ago

@andrewdunsworth you can edit your comment posts after they're posted, fyi.

andrewdunsworth commented 8 years ago

Here are the two sequences as seen on the scope:

Normal sequence: 20151209_134024

Messed up sequence: 20151209_134024

Green is the z-board, blue is the xy, yellow is the ADC demod, and pink is the monitor_1 of the readout dac board which is set to 1 which according to

https://github.com/martinisgroup/servers/blob/u/maffoo/dac-docs/fpgalib/docs/GHzDAC_v8.md

is the SRAM write:

It seems like it plays the SRAM (without the jump table delay) 16 times at 16 us intervals and then plays the SRAM with the correct delay after that. I have set up the scope to only trigger once (I think before we were getting false pulse sequences because it was triggering multiple times with the messed up Z pulses).
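If the scope traces are saved as sampled data, this failure signature could be flagged automatically instead of by eye. The sketch below assumes the z pulse shows up as a dip below some threshold and that a bad run has several dips spaced 16 us apart. The function names, the threshold, the tolerance, and the minimum repeat count are all assumptions to tune.

```python
def falling_edges(times_us, volts, threshold):
    """Return the sample times at which the trace crosses below threshold."""
    edges = []
    for i in range(1, len(volts)):
        if volts[i - 1] >= threshold > volts[i]:
            edges.append(times_us[i])
    return edges


def looks_like_replay(edge_times_us, period_us=16.0, tolerance_us=0.5, min_repeats=3):
    """True if at least min_repeats consecutive edge spacings match period_us."""
    run = 0
    for earlier, later in zip(edge_times_us, edge_times_us[1:]):
        if abs((later - earlier) - period_us) <= tolerance_us:
            run += 1
            if run >= min_repeats:
                return True
        else:
            run = 0
    return False
```

A normal trace with a single z pulse gives one falling edge, so `looks_like_replay` returns False for it.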

andrewdunsworth commented 8 years ago

When we put the adc_1 monitor on the SRAM write it seems like it writes at the same place for both the 'good' pulse sequence and the 'messed up' one:

good: 20151209_134024

messed: 20151209_142022

andrewdunsworth commented 8 years ago

We put blue on monitor_0 = 5 which is the SRAM running option. It seems to know that it is running SRAM the entire time as the pulses look like this:

Good: 20151209_142022 Bad:

maffoo commented 8 years ago

@andrewdunsworth, do you have Wireshark packet captures for two runs like these? It would be really helpful to see exactly what got sent to the boards.

andrewdunsworth commented 8 years ago

Ok so if we set blue to IDLE and pink to END (I'm guessing these are the jump table commands), it looks like: good: 20151209_144249 bad: 20151209_144228

So it seems to think that it is idling for the entire length of the messed up pulse sequence... which doesn't make sense (notice that there is a slight down tick where the actual idle occurs at the end of the messed up sequence).

andrewdunsworth commented 8 years ago

This really makes it look like it's a jump table problem and not a hardware problem.

andrewdunsworth commented 8 years ago

We are working on getting Wireshark up and running.

andrewdunsworth commented 8 years ago

We now have Wireshark but are unsure how to parse the data.
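As a starting point for parsing by hand, the first 14 bytes of any untagged Ethernet frame are fixed: destination MAC, source MAC, then a 2-byte type/length field. A minimal sketch, assuming the raw frame bytes are already in hand (for example from a saved capture); `parse_ethernet_header` is a made-up helper name. It does not decode the board-specific payload and does not handle VLAN-tagged frames.

```python
import struct


def parse_ethernet_header(frame):
    """Split a raw Ethernet frame into dst MAC, src MAC, type/length, payload."""
    if len(frame) < 14:
        raise ValueError("frame shorter than the 14-byte Ethernet header")
    # Network byte order: 6-byte dst, 6-byte src, 16-bit type/length field.
    dst, src, type_or_length = struct.unpack("!6s6sH", frame[:14])
    return {
        "dst": ":".join("{:02x}".format(b) for b in bytearray(dst)),
        "src": ":".join("{:02x}".format(b) for b in bytearray(src)),
        "type_or_length": type_or_length,
        "payload": frame[14:],
    }
```

Grouping frames by the `dst` field would at least separate traffic per board before we dig into the payload bytes.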

andrewdunsworth commented 8 years ago

We are now sharking wires.

labrad / servers

Jump Table Commands in Wrong Order #281