m-labs / artiq

A leading-edge control system for quantum information experiments
https://m-labs.hk/artiq
GNU Lesser General Public License v3.0

Clocking, DAC support and JESD synchronization on one Sayma card #794

Closed sbourdeauducq closed 6 years ago

sbourdeauducq commented 6 years ago

On my test:

[     6.513475s]  INFO(board_artiq::ad9154):   phase min: 58, phase max: 121, phase opt: 89
[     9.581414s]  INFO(board_artiq::ad9154):   phase min: 58, phase max: 121, phase opt: 89
sbourdeauducq commented 6 years ago
[     7.734548s]  INFO(board_artiq::ad9154):   phase: 0, sync error: 0
...
[     8.603360s]  INFO(board_artiq::ad9154):   phase: 58, sync error: 481
...
[     9.576190s]  INFO(board_artiq::ad9154):   phase: 122, sync error: 482
enjoy-digital commented 6 years ago

Everything should be implemented, before closing we need to:

hartytp commented 6 years ago

@enjoy-digital I'll try to have a go at some of that tomorrow.

Is the phase scan implemented in the code by default atm?

hartytp commented 6 years ago

Thanks for all the work you've done on this recently. Things are starting to shape up nicely.

enjoy-digital commented 6 years ago

@hartytp: yes, the phase scan is implemented in the code. If you do a test, can you post your results here? (It will let us know whether we can use the same values for all boards.)

hartytp commented 6 years ago

Will do.

What are you doing for the SC1 test? Just looking for DAC realignment events by reading out the register values, or looking at RF outputs on a scope?

enjoy-digital commented 6 years ago

We modify the phase until we have seen at least 2 realignments (since we are not sure to have the full scan for the first one). Then we put the phase in the middle of the 2 realignments and trigger another sync on the DAC. I don't have the equipment to test SC1 on the RF output. If you have it and can do the test, we'll be happy to have the results :)
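
To illustrate the procedure described above, here is a minimal sketch in Python. It is not the firmware code: `find_sync_window` and `fake_sync_error` are hypothetical names, and the fake error counter is only modelled on the log lines earlier in this thread (a change of the sync error value at phase 58 and again at phase 122).

```python
def find_sync_window(scan):
    """Return (phase_min, phase_max, phase_opt) from a phase scan.

    scan is a list of (phase, sync_error) pairs in increasing phase order.
    A change of sync_error between two steps is treated as a realignment.
    Returns None if fewer than two realignments were seen.
    """
    realignments = []
    prev_error = scan[0][1]
    for phase, error in scan[1:]:
        if error != prev_error:
            realignments.append(phase)
            prev_error = error
        if len(realignments) == 2:
            break
    if len(realignments) < 2:
        return None
    phase_min = realignments[0]
    phase_max = realignments[1] - 1
    # Park the phase halfway between the two realignments
    phase_opt = (phase_min + phase_max) // 2
    return phase_min, phase_max, phase_opt


def fake_sync_error(phase):
    # Hypothetical stand-in for reading the DAC sync error register
    if phase < 58:
        return 0
    if phase < 122:
        return 481
    return 482


scan = [(phase, fake_sync_error(phase)) for phase in range(128)]
print(find_sync_window(scan))
```

With this synthetic scan the midpoint lands on the same min/max/opt values as in the log at the top of the thread. The real hardware obviously does not have to behave this cleanly.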

hartytp commented 6 years ago

Okay, I'll start with your phase scan and see how I get on. If things go well, I'll have a look on my scope.

enjoy-digital commented 6 years ago

Thanks!

sbourdeauducq commented 6 years ago

I don't have the equipment to test SC1 on the RF output.

You do; even a cheap DSO can do it with some signal processing (and a long enough sample buffer). Set up two DACs to output continuous 51MHz sine tones (with a startup kernel). Here are some hacky scripts I was using (based on code by @jordens):

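import asyncio
import os
import time

import ds1xxxz  # scope helper module (not shown in this thread)

def main():
    for i in range(30):
        # poke the device on /dev/ttyACM0, then give it time to come back up
        os.system("echo > /dev/ttyACM0")
        time.sleep(6)
        # reload the RTM and wait for the whole system to start
        os.system("sh load_rtm")
        time.sleep(60)
        print("getting waveforms", i)
        # one .npz file per run, consumed by the analysis script
        ds1xxxz.save_waveforms("192.168.1.132", 5555, "waveforms_{:02d}.npz".format(i))
    asyncio.get_event_loop().close()

if __name__ == "__main__":
    main()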

(You need to set up the scope before using that first script, to have the two channels + a large memory with iirc 120k samples)

import numpy as np
import matplotlib
import matplotlib.pyplot as plt
from scipy import signal

nrows = 10
ncols = 3

matplotlib.rcParams.update({'font.size': 6})

for i in range(30):
    filename = "waveforms_{:02d}.npz".format(i)
    # each file holds x (time step and offset) and the two channels y1, y2
    data = np.load(filename)
    x, y1, y2 = data["x"], data["y1"], data["y2"]
    t = np.arange(y1.shape[0])*x[0] + x[1]
    y = np.c_[y1, y2].T
    # mix down to baseband, then low-pass and decimate in two stages
    z = signal.decimate(y*np.exp(1j*2*np.pi*50e6*t), q=100, ftype="fir", zero_phase=True)[:, 100:]
    z = signal.decimate(z, q=100, ftype="fir", zero_phase=True)[:, 100:]
    # relative phase between the two channels, in radians
    angle = np.angle(np.mean(z[0]*z[1].conj()))

    plt.subplot(nrows, ncols, i+1).set_title("{} {:.4f}".format(filename, angle))
    plt.plot(t[:120], y[:, :120].T)
plt.savefig("sayma.pdf")
plt.show()
enjoy-digital commented 6 years ago

OK thanks, maybe I have the equipment, but you are asking me to do system-level tests that I'm not the right person for (without your previous answer I would have had no idea how to test it). So let's each stick to our own job:

sbourdeauducq commented 6 years ago

if you never try things you have no idea about, you never learn :P

hartytp commented 6 years ago

I'm happy to have a go at this once my Sayma arrives back, which I hope will be today (@gkasprow do you have a tracking number for it btw?)

To confirm, the two DACs should currently have deterministic latency with respect to the input 1.2GHz clock, but not with respect to the RTIO clock, right? So, if I set channels on different DACs to the same frequency and phase then the phase offset between them should be constant, right? Similarly, if I set the DAC output to a sub-multiple of the 1.2GHz clock frequency, I should see a constant phase delay between the RF output and the DAC clock, right?

enjoy-digital commented 6 years ago

@sbourdeauducq: sure, that's also what I think and how I learn. But that's not something you can always do. I'll be happy to learn about that, but maybe later; for now I'm just trying to be practical, since I have limited time and would rather focus on things I'm really relevant for.

sbourdeauducq commented 6 years ago

@hartytp just compare 2 DAC channels from each DAC chip. The rest is not necessarily det-lat yet.

hartytp commented 6 years ago

ack

@hartytp just compare 2 DAC channels from each DAC chip. The rest is not necessarily det-lat yet.

edit: to check I understand you here, you want me to compare the relative phases of a pair of channels on the same DAC (i.e. not comparing the phases between DAC chips). I thought that was always deterministic, without any synchronization work required.

sbourdeauducq commented 6 years ago

No, I meant comparing between the two DAC chips. Take one channel from AD9154-0 and one channel from AD9154-1.

hartytp commented 6 years ago

Okay, good. I'll do that once my AMC+RTM turn up (tomorrow?).

sbourdeauducq commented 6 years ago

@enjoy-digital The phase shift value definitely needs to be fixed; I'm getting intermittently 56 or 88 for both DACs (independently) of one board.

sbourdeauducq commented 6 years ago

Also the logs are excessively verbose, it would be sufficient to print only when the error count is first measured and then changes.

enjoy-digital commented 6 years ago

I reduced the logs.

Now for the intermittent results, let's try to analyze that:

Combined together, we can see this as a configurable delay with 26ps (416ps/16) resolution. (This is not really what is happening, since we have a gap between 400 and 416 ps, but we use that for the analysis.)

So when testing on board:

Not sure what to conclude from that now; at least I see that these intermittent results are not coming from corner cases in the sysref scan, but are the behaviour of the system.

sbourdeauducq commented 6 years ago

DAC synch is not working; the DACs just have random phases wrt each other after each power-up. This can be measured with the script below that uses the Red Pitaya currently connected in the lab, and sines.py modified to output 9MHz (currently flashed as startup kernel). From what I observed, the phase varies within a 0 to 0.38 radian window of the 9MHz waveform, i.e. 6.7ns. This corresponds to 1 period of the 150MHz clock.

import socket
import numpy as np
import matplotlib.pyplot as plot
from scipy import signal

class RPSCPI:
    def connect(self, host):
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.sock.connect((host, 5000))
        self.sock_f = self.sock.makefile()

    def close(self):
        self.sock_f.close()
        self.sock.close()

    def sendmsg(self, msg):
        self.sock.send(msg.encode() + b"\r\n")

    def recvmsg(self):
        return self.sock_f.readline().strip()

    def trigger(self):
        # start an acquisition, trigger immediately, then poll until done
        self.sendmsg("ACQ:START")
        self.sendmsg("ACQ:TRIG NOW")
        while True:
            self.sendmsg("ACQ:TRIG:STAT?")
            if self.recvmsg() == "TD":
                break

    def get_data(self, channel):
        self.sendmsg("ACQ:SOUR{}:DATA?".format(channel))
        # the reply is a brace-enclosed, comma-separated list of samples
        buff_string = self.recvmsg()[1:-1].split(',')
        return np.array(list(map(float, buff_string)))

rp = RPSCPI()
rp.connect("192.168.1.200")
try:
    rp.trigger()
    y1 = rp.get_data(1)
    y2 = rp.get_data(2)

    # 125 MS/s sample rate
    t = np.arange(y1.shape[0])/125e6
    y = np.c_[y1, y2].T
    # mix the 9MHz tones down to baseband, then low-pass and decimate twice
    z = signal.decimate(y*np.exp(1j*2*np.pi*9e6*t), q=10, ftype="fir", zero_phase=True)[:, 10:]
    z = signal.decimate(z, q=10, ftype="fir", zero_phase=True)[:, 10:]
    # relative phase between the two channels, in radians
    angle = np.angle(np.mean(z[0]*z[1].conj()))

    print(angle)

    # optional, check visually that we are not measuring noise
    plot.plot(y1)
    plot.plot(y2)
    plot.show()

finally:
    rp.close()
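
For anyone wanting to sanity-check the measurement principle without hardware, here is a rough, stdlib-only sketch on synthetic data. It mixes each channel with a complex exponential at the tone frequency, sums, and takes the angle of the cross product. The assumptions are mine: noise-free 9MHz tones sampled at 125 MS/s, 12500 samples (an integer number of tone periods, so no filtering is needed), and a second channel delayed by exactly one 150MHz period. The sign convention is chosen so that a lagging second channel gives a positive angle, which may differ from the script above.

```python
import cmath
import math

FS = 125e6       # sample rate, as in the script above
F_TONE = 9e6     # tone frequency
N = 12500        # assumption: covers exactly 900 tone periods


def relative_phase(y1, y2, freq, fs):
    """Phase of y1 minus phase of y2 at frequency freq, in radians."""
    def demod(y):
        # single-bin mix-down: multiply by exp(-j*2*pi*f*t) and sum
        return sum(v * cmath.exp(-2j * math.pi * freq * n / fs)
                   for n, v in enumerate(y))
    return cmath.phase(demod(y1) * demod(y2).conjugate())


tau = 1 / 150e6  # simulated skew: one period of the 150MHz clock
t = [n / FS for n in range(N)]
y1 = [math.sin(2 * math.pi * F_TONE * ti) for ti in t]
y2 = [math.sin(2 * math.pi * F_TONE * (ti - tau)) for ti in t]

angle = relative_phase(y1, y2, F_TONE, FS)
delay_ns = angle / (2 * math.pi * F_TONE) * 1e9
print("phase difference: {:.4f} rad, i.e. {:.2f} ns".format(angle, delay_ns))
```

A one-period slip of the 150MHz clock shows up as roughly 0.377 rad of the 9MHz waveform, which matches the size of the window reported above.
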
hartytp commented 6 years ago

I think that's expected with the current state of the code @sbourdeauducq since we don't issue a reset command after configuring the dividers.

From a quick skim over the data sheet, I think that to enable synchronisation you need to edit these lines to enable synchronisation for all channels by setting bit 6 high:

https://github.com/m-labs/artiq/blob/db4d1878d3528d65b9cace5a151030fa125648a2/artiq/firmware/libboard_artiq/hmc830_7043.rs#L261-L266

should be

if enabled {
    // Only clock channels need to be high-performance
    if (channel % 2) == 0 { write(channel_base, 0xD1); }
    else { write(channel_base, 0x51); }
}
else { write(channel_base, 0x10); }

cf page 39 of the data sheet.

Then, send a sync request via spi as the last part of the init.

Edit: NB multislip delays are only required for CMOS outputs to get deterministic latency. For LVPECL (which we use) this is not required.
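
To make the register values in the proposed patch easier to read, this small Python snippet lists which bits are set in each of them. It only decodes the numbers; apart from bit 6 (the synchronisation enable mentioned above), the meaning of the individual bits should be checked against the data sheet. `set_bits` is a hypothetical helper.

```python
def set_bits(value):
    """Return the positions of the bits that are set in an 8-bit value."""
    return [bit for bit in range(8) if value & (1 << bit)]


values = [("even channel", 0xD1), ("odd channel", 0x51), ("disabled", 0x10)]
for name, value in values:
    print("{:12s} 0x{:02X} = {:08b} bits {}".format(
        name, value, value, set_bits(value)))

# The two enabled values differ only in bit 7, which going by the code
# comment is presumably the high-performance selection.
print("difference: 0x{:02X}".format(0xD1 ^ 0x51))
```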

hartytp commented 6 years ago

As for synchronising the dividers via SPI, see the "typical programming sequence" section.

Essentially, one just needs to play with register 0x01.

I think it's enough just to add a reseed request after configuring all channels by adding write(0x01, 0xc0) to the end of the init.

If that doesn't work then try resetting the dividers + FSMs before the reseed request via

write(0x01, 0x42) // Reset dividers
write(0x01, 0x40) // high-performance, low-noise mode

If you have any issues after doing that, try reading out the alarm register (make sure to set the appropriate mask) to check that there has been a resync attempt, and that the phases are stable.

@sbourdeauducq can you give that a try?

hartytp commented 6 years ago

Maybe I'm missing something here, but I thought that @enjoy-digital's phase adjustment algorithm gave reproducible results across power cycles. Yet, from looking at the code, I think the output channels should have random (n/1.2GHz) phase offsets between power cycles. Not sure what's up there, but it should be obvious when looked at with a fast scope.

sbourdeauducq commented 6 years ago

It did something, but there are still major problems (this is Sayma after all). Some results:

[     8.458470s]  INFO(board_artiq::ad9154):   phase min: Some(58), phase max: Some(121)
[    11.233409s]  INFO(board_artiq::ad9154):   phase min: Some(58), phase max: Some(121)
0.28627736427876727

[     8.244383s]  INFO(board_artiq::ad9154):   phase min: Some(26), phase max: Some(89)
[    10.728289s]  INFO(board_artiq::ad9154):   phase min: Some(58), phase max: Some(121)
-0.08945706092316498

[     8.458471s]  INFO(board_artiq::ad9154):   phase min: Some(58), phase max: Some(121)
[    10.942449s]  INFO(board_artiq::ad9154):   phase min: Some(58), phase max: Some(121)
0.28640013106909185

[    13.522800s]  INFO(board_artiq::ad9154):   phase min: Some(58), phase max: Some(121)
[    15.681677s]  INFO(board_artiq::ad9154):   phase min: Some(26), phase max: Some(89)
0.38043811927970395

[     8.458470s]  INFO(board_artiq::ad9154):   phase min: Some(58), phase max: Some(121)
[    10.617313s]  INFO(board_artiq::ad9154):   phase min: Some(26), phase max: Some(89)
0.4274499995383124

[     8.458470s]  INFO(board_artiq::ad9154):   phase min: Some(58), phase max: Some(121)
[    10.942448s]  INFO(board_artiq::ad9154):   phase min: Some(58), phase max: Some(121)
0.2864263089443654

[     8.578176s]  INFO(board_artiq::ad9154):   phase min: Some(58), phase max: Some(121)
[    11.456094s]  INFO(board_artiq::ad9154):   phase min: Some(58), phase max: Some(121)
-0.04248751461332089

[     8.458473s]  INFO(board_artiq::ad9154):   phase min: Some(58), phase max: Some(121)
[    10.617314s]  INFO(board_artiq::ad9154):   phase min: Some(26), phase max: Some(89)
0.3802849758152413

26/89 on first DAC
[    16.293076s]  INFO(board_artiq::ad9154):   phase min: Some(58), phase max: Some(121)
0.19254855542815777

[    12.629471s]  INFO(board_artiq::ad9154):   phase min: Some(58), phase max: Some(121)
[    15.404492s]  INFO(board_artiq::ad9154):   phase min: Some(58), phase max: Some(121)
0.28653390064741524

This is with reseed only, resetting the dividers+FSM doesn't seem to fix anything.

It may be just a coincidence, but it appears to work sometimes when both DACs yield 58/121 during the sysref scan (which makes some sense since we set sysref to 88).
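
A quick way to look at that correlation is to group the runs by the scan windows reported for the two DACs. The sketch below uses the results above, transcribed by hand and rounded to 4 decimals; the run where only "26/89 on first DAC" was noted is entered with that window for the first DAC. Treat it as an illustration, ten runs is not much data.

```python
# (window on first DAC, window on second DAC, phase difference in rad)
runs = [
    ((58, 121), (58, 121), 0.2863),
    ((26, 89), (58, 121), -0.0895),
    ((58, 121), (58, 121), 0.2864),
    ((58, 121), (26, 89), 0.3804),
    ((58, 121), (26, 89), 0.4274),
    ((58, 121), (58, 121), 0.2864),
    ((58, 121), (58, 121), -0.0425),
    ((58, 121), (26, 89), 0.3803),
    ((26, 89), (58, 121), 0.1925),
    ((58, 121), (58, 121), 0.2865),
]

# Split the runs depending on whether both DACs reported the 58/121 window
both = [a for w0, w1, a in runs if w0 == w1 == (58, 121)]
other = [a for w0, w1, a in runs if not (w0 == w1 == (58, 121))]

print("both DACs 58/121:", sorted(both))
print("other combinations:", sorted(other))
```

Four of the five runs with 58/121 on both DACs land within a milliradian of each other, while the remaining runs are scattered.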

hartytp commented 6 years ago

@sbourdeauducq So, do you think that the issue is now with the phase finding code?

In any case, I don't think there is much point me looking at this further until I have access to the hw.

e.g. if I were debugging this, my next step would be to physically check that after reseeding the HMC7043 outputs all have deterministic relative phases. If the issue is still outstanding when I get my hands on working HW, I can perform some quick measurements to help you track this issue down.

hartytp commented 6 years ago

(Or, @enjoy-digital is welcome to come to our labs and test this using our T & M equipment next time he's in the UK).

sbourdeauducq commented 6 years ago

The phase scan code is not configuring any chip based on the results. I was just trying to correlate the two, as I am surprised by how non-deterministic the phase scanning code is, too.

hartytp commented 6 years ago

To look at this, I changed the FPGA_CLOCK divider to 12 (100MHz output) and looked at J61 on a fast scope triggered from my 100MHz reference. I can confirm that the HMC7043 configuration currently used in ARTIQ master does not provide deterministic latency. I'll apply the patch I proposed above and recheck.

hartytp commented 6 years ago

Nope, even with that patch, the 7043 outputs don't have deterministic latency w.r.t. the reference.

hartytp commented 6 years ago

Will dig into this further tomorrow.

sbourdeauducq commented 6 years ago

Great, thanks for all your help!

jbqubit commented 6 years ago

Analog Devices talks about deterministic latency of the HMC7043 in a report on a huge clock tree.

http://www.analog.com/en/technical-articles/synchronizing-sample-clocks-of-a-data-converter-array.html

hartytp commented 6 years ago

Joe, I'm being daft and measuring the wrong thing. That was never going to work, as I was measuring the ref to HMC output phase, which we can't control. I should have measured the phase between the HMC7043 outputs!

hartytp commented 6 years ago

Well can't = can't via SPI alone

hartytp commented 6 years ago

One more time with brain attached. Now looking at the two 150MHz outputs from the HMC7043. I can confirm that the relative phases of these outputs are not stable across FPGA loads with the current ARTIQ master.

I'll apply the patch and see if that fixes it.

hartytp commented 6 years ago

Patch doesn't fix it because we also need to configure the internal SYSREF timer. Doing that now.

hartytp commented 6 years ago

The relative phases of the HMC7043 outputs are fixed as of https://github.com/m-labs/artiq/pull/1049

sbourdeauducq commented 6 years ago

SC1 now seems to work intermittently. Maybe the desynch is simply due to the JESD204 elastic buffer bug. Here are the phase differences in a 9MHz waveform that I measured between the two DACs, reloading the bitstream every time. Unit is radian.

-3.0122085695401912
0.003484989809197922
0.0035013579182848783
0.003437719904394562
0.0034271564472742777
-3.012130036638558
0.0034841246460592993
0.0035376824528900223
0.0035230510713831747
0.003476287717748864
3.0188539470060625
0.0034901997337270447
-3.012092422739266
3.0190223329751356
-3.0119953449379806

There seems to be a resurgence of serwb bugs, which makes this testing a bit annoying.

sbourdeauducq commented 6 years ago

After JESD update:

0.379348417299257
0.3321831450064115
0.3793770410950291
0.37937666640012396
0.37935438836814794
0.3793441872220443
0.3793281246479074
0.3794722930935313
0.37942851699262903
0.0036921377386391853
0.3793404124933203

Observation: the difference between the second result and the usual value is 831ps, which is close to the 833ps period of the 1.2GHz clock.
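
The same comparison can be done for every result in the list by converting each deviation from the usual value into time and into periods of the 1.2GHz clock. In this sketch the median is taken as the "usual value", which is my assumption and gives a number a few ps away from the one quoted above.

```python
import math
import statistics

F_TONE = 9e6      # frequency of the measured waveform
F_CLK = 1.2e9     # clock frequency used for comparison

angles = [
    0.379348417299257, 0.3321831450064115, 0.3793770410950291,
    0.37937666640012396, 0.37935438836814794, 0.3793441872220443,
    0.3793281246479074, 0.3794722930935313, 0.37942851699262903,
    0.0036921377386391853, 0.3793404124933203,
]

usual = statistics.median(angles)  # assumption: median as the usual value

# deviation from the usual value, in ps and in 1.2GHz clock periods
offsets_ps = [(a - usual) / (2 * math.pi * F_TONE) * 1e12 for a in angles]
periods = [dt * 1e-12 * F_CLK for dt in offsets_ps]

for a, dt, n in zip(angles, offsets_ps, periods):
    print("{:+.4f} rad  {:+8.0f} ps  {:+6.2f} clock periods".format(a, dt, n))
```

The second result comes out close to one 1.2GHz period away from the rest, and the tenth close to eight periods, i.e. about one period of the 150MHz clock.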

hartytp commented 6 years ago

getting there...

jbqubit commented 6 years ago

That’s great!

hartytp commented 6 years ago

Looking at this on my Sayma.

Running the following startup kernel:

from artiq.experiment import *

class SAWGTest(EnvExperiment):
    def build(self):
        self.setattr_device("core")
        self.sawgs = [self.get_device("sawg{}".format(i)) for i in range(8)]

    @kernel
    def run(self):
        self.core.reset()
        for sawg in self.sawgs:
            sawg.reset()
            delay(300 * us)

        for sawg in self.sawgs:
            sawg.frequency0.set(10*MHz)
            sawg.amplitude1.set(1.)
            delay(10*us)

Looking at the phase between 1 channel on each DAC using a fast scope. NB 1 deg at 10MHz = 278ps; 1 period at 600MHz (DAC clock) = 1.67ns.

Over 15 loads of the FPGAs, including 3 power cycles, I've seen the same phase difference between the two channels to within 1 deg (measurement accuracy). So, this seems to work just fine for me.

Good!

hartytp commented 6 years ago

Cool, so we have working SAWG and SC1. If we can just fix the remaining crashes/serwb issues then we're sorted.

jbqubit commented 6 years ago

Running 4.0.dev+1133.g0b086225. Compared the phase between a pair of channels spanning both DACs on a single Sayma. Set frequency0 to 40 MHz. Measured the phase difference using a Tek FCA 3003 timer, using 10k samples at 10 ms intervals. Saw a standard deviation < 1 deg with peak deviation < 9 deg. Cycling power off and reloading 5 times, I see a variation in mean relative phase of -96.11, -95.46, -95.66, -96.13, -95.50 deg.

hartytp commented 6 years ago

At 40MHz, 1 deg is about 70ps, so 9 deg = 625ps. Given that the HMC830 clock is 1.2GHz (833ps period), it sounds like you have seen the two DACs synchronised to better than one clock cycle.
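
For reference, the arithmetic behind these numbers as a short sketch (`deg_to_ps` is a hypothetical helper; the mean phases are the five values reported in the previous comment):

```python
def deg_to_ps(deg, freq_hz):
    """Convert a phase in degrees at freq_hz into a time in picoseconds."""
    return deg / 360.0 / freq_hz * 1e12


clock_period_ps = 1e12 / 1.2e9          # period of the 1.2GHz clock

peak_ps = deg_to_ps(9, 40e6)            # peak deviation within one run
means_deg = [-96.11, -95.46, -95.66, -96.13, -95.50]
# spread of the mean phase across power cycles
spread_ps = deg_to_ps(max(means_deg) - min(means_deg), 40e6)

print("1 deg at 10MHz: {:.1f} ps".format(deg_to_ps(1, 10e6)))
print("9 deg at 40MHz: {:.1f} ps".format(peak_ps))
print("spread of means across power cycles: {:.1f} ps".format(spread_ps))
print("1.2GHz clock period: {:.1f} ps".format(clock_period_ps))
```

The spread of the mean phase across the five power cycles is well under 100ps, so it is the peak deviation within a run that comes closest to the 833ps clock period.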

Closing this issue.

jbqubit commented 6 years ago

I got the fancy scope back so can make a better measurement. Scope and 100 MHz clock to Sayma are phase locked. Still using 4.0.dev+1133.g0b086225 .

First, compare phase between a pair of channels on single DAC. Here, frequency0 is 210 MHz. No skew, as expected.

[image: tek006_000]

Now compare DAC channels spanning the pair of chips. frequency0 is 10 MHz. Cycle power and reload .bit's in between. I'm generally getting two results for mean skew, separated by one period of f_RTIO_coarse: 1/(150 MHz) = 6.7 ns.

6.72 ns
6.85 ns
-0.18 ns
6.82 ns
0.13 ns

Expected outcome is skew < 1/f_DAC between board resets. Please reopen Issue.
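
The two clusters can be made explicit by expressing each skew as a whole number of coarse RTIO periods plus a residual. A minimal sketch, assuming f_DAC is the 600MHz DAC clock mentioned earlier in the thread:

```python
F_RTIO_COARSE = 150e6
F_DAC = 600e6    # assumption: DAC clock as quoted earlier in this thread

skews_ns = [6.72, 6.85, -0.18, 6.82, 0.13]
period_ns = 1e9 / F_RTIO_COARSE

# number of whole coarse RTIO periods closest to each skew
slips = [round(s / period_ns) for s in skews_ns]
# what is left once the whole periods are removed
residuals_ns = [s - n * period_ns for s, n in zip(skews_ns, slips)]

for s, n, r in zip(skews_ns, slips, residuals_ns):
    print("{:+.2f} ns = {} period(s) {:+.2f} ns".format(s, n, r))

print("1/f_DAC = {:.2f} ns".format(1e9 / F_DAC))
```

Once the whole-period slip is removed, the residuals are all well below 1/f_DAC, so the remaining problem looks like a slip of exactly one coarse RTIO period on some loads.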

[images: tek006_007, tek006_008, tek006_009, tek006_010, tek006_011]

hartytp commented 6 years ago

@jbqubit thanks for checking that more carefully. I don't think I can reproduce this on my board...

@enjoy-digital am I right in thinking: