m-labs / artiq

A leading-edge control system for quantum information experiments
https://m-labs.hk/artiq
GNU Lesser General Public License v3.0

sayma ramp gen issues #1166

Closed hartytp closed 4 years ago

hartytp commented 6 years ago

Building in a fresh artiq-dev=4.0.dev conda environment with the latest master. Only change is to set the HMC830 reference frequency to 150MHz in the Sayma AMC target. Clocking at 150MHz from a synth.

Building without sawg, I see glitches on the ramp generator's output

sync

Some channels look better than others, but all have some level of glitch. This could well be related to the synchronisation issues I see...

sbourdeauducq commented 6 years ago

@marmeladapk / @jbqubit are you able to reproduce this?

hartytp commented 6 years ago

Reproduced in a fresh conda environment with a current master, now clocking at 100MHz. Binaries are here, timing is met.

conda list
# packages in environment at /home/tph/anaconda3/envs/artiq-dev:
#
# Name                    Version                   Build  Channel
aiohttp                   3.1.3                    py35_0    conda-forge/label/main
alabaster                 0.7.12                     py_0    conda-forge/label/main
artiq                     4.0.dev0+1408.gd0ee2c29           <pip>
artiq-dev                 4.0.dev         1408+gitd0ee2c29    m-labs/label/dev
asn1crypto                0.24.0                   py35_3    conda-forge/label/main
async-timeout             2.0.1                    py35_0    conda-forge/label/main
asyncserial               0.1             py_13+git340e430    m-labs/label/main
attrs                     18.2.0                     py_0    conda-forge/label/main
babel                     2.6.0                      py_1    conda-forge/label/main
binutils-or1k-linux       2.30                          7    m-labs/label/main
blas                      1.0                         mkl  
bscan-spi-bitstreams      0.10.0                        2    m-labs/label/main
bzip2                     1.0.6                h470a237_2    conda-forge/label/main
ca-certificates           2018.8.24            ha4d7672_0    conda-forge/label/main
certifi                   2018.8.24             py35_1001    conda-forge/label/main
cffi                      1.11.5           py35h5e8e0c9_1    conda-forge/label/main
chardet                   3.0.4                    py35_3    conda-forge/label/main
colorama                  0.3.9                      py_1    conda-forge/label/main
coverage                  4.5.1            py35h470a237_1    conda-forge/label/main
cryptography              2.3.1            py35hdffb7b8_0    conda-forge/label/main
cryptography-vectors      2.3.1                    py35_0    conda-forge/label/main
dbus                      1.13.0               h3a4f0e9_0    conda-forge/label/main
docutils                  0.14                     py35_1    conda-forge/label/main
expat                     2.2.5                hfc679d8_2    conda-forge/label/main
fontconfig                2.13.1               h65d0f4c_0    conda-forge/label/main
freetype                  2.9.1                h6debe1e_4    conda-forge/label/main
gettext                   0.19.8.1             h5e8e0c9_1    conda-forge/label/main
glib                      2.55.0               h464dc38_2    conda-forge/label/main
gst-plugins-base          1.12.5               hde13a9d_0    conda-forge/label/main
gstreamer                 1.12.5               h61a6719_0    conda-forge/label/main
h5py                      2.8.0            py35h7eb728f_2    conda-forge/label/main
hdf5                      1.10.2               hc401514_2    conda-forge/label/main
icu                       58.2                 hfc679d8_0    conda-forge/label/main
idna                      2.7                      py35_2    conda-forge/label/main
idna_ssl                  1.0.0                         0    conda-forge/label/main
imagesize                 1.1.0                      py_0    conda-forge/label/main
intel-openmp              2019.0                      118  
jesd204b                  0.10                       py_1    m-labs/label/main
jinja2                    2.10                       py_1    conda-forge/label/main
jpeg                      9c                   h470a237_1    conda-forge/label/main
levenshtein               0.12.0                   py35_1    m-labs/label/main
libcurl                   7.61.0               h1ad7b7a_0  
libffi                    3.2.1                hfc679d8_5    conda-forge/label/main
libgcc-ng                 7.2.0                hdf63c60_3    conda-forge/label/main
libgfortran               3.0.0                         1    conda-forge/label/main
libgfortran-ng            7.2.0                hdf63c60_3    conda-forge/label/main
libgit2                   0.24.1                        7    m-labs/label/main
libiconv                  1.15                 h470a237_3    conda-forge/label/main
libpng                    1.6.35               ha92aebf_2    conda-forge/label/main
libssh2                   1.8.0                h5b517e9_2    conda-forge/label/main
libstdcxx-ng              7.2.0                hdf63c60_3    conda-forge/label/main
libusb                    1.0.20                        0    m-labs/label/main
libuuid                   2.32.1               h470a237_2    conda-forge/label/main
libxcb                    1.13                 h470a237_2    conda-forge/label/main
libxml2                   2.9.8                h422b904_5    conda-forge/label/main
lit                       0.4.1                      py_9    m-labs/label/main
llvm-or1k                 6.0.0                        25    m-labs/label/main
llvmlite-artiq            0.23.0.dev               py35_4    m-labs/label/main
markupsafe                1.0              py35h470a237_1    conda-forge/label/main
microscope                1.3                        py_1    m-labs/label/main
migen                     0.7             py35_73+gitbef9dea    m-labs/label/dev
misoc                     0.11            py35_31+git5ce139dd    m-labs/label/dev
mkl                       2019.0                      118  
mkl_fft                   1.0.6                    py35_0    conda-forge/label/main
mkl_random                1.0.1                    py35_0    conda-forge/label/main
msgpack-python            0.5.6            py35h2d50403_3    conda-forge/label/main
multidict                 4.4.2            py35h470a237_0    conda-forge/label/main
ncurses                   6.1                  hfc679d8_1    conda-forge/label/main
numpy                     1.15.0           py35h1b885b7_0  
numpy-base                1.15.0           py35h3dfced4_0  
openocd                   0.10.0                        6    m-labs/label/main
openssl                   1.0.2p               h470a237_0    conda-forge/label/main
outputcheck               0.4.2                      py_7    m-labs/label/main
pcre                      8.41                 hfc679d8_3    conda-forge/label/main
pip                       18.0                  py35_1001    conda-forge/label/main
prettytable               0.7.2                      py_2    conda-forge/label/main
pthread-stubs             0.4                  h470a237_1    conda-forge/label/main
pycparser                 2.19                       py_0    conda-forge/label/main
pygit2                    0.24.0                   py35_4    m-labs/label/main
pygments                  2.2.0                      py_1    conda-forge/label/main
pyopenssl                 18.0.0                   py35_0    conda-forge/label/main
pyqt                      5.6.0            py35h8210e8a_7    conda-forge/label/main
pyqtgraph                 0.10.0                     py_5    conda-forge/label/main
pyserial                  3.4                      py35_0    conda-forge/label/main
pysocks                   1.6.8                    py35_2    conda-forge/label/main
python                    3.5.5                h5001a0f_2    conda-forge/label/main
python-dateutil           2.7.3                      py_0    conda-forge/label/main
pythonparser              1.1                        py_8    m-labs/label/main
pytz                      2018.5                     py_0    conda-forge/label/main
qt                        5.6.2                hf70d934_9    conda-forge/label/main
quamash                   0.5.5                      py_4    m-labs/label/main
readline                  7.0                  haf1bffa_1    conda-forge/label/main
regex                     2015.11.22               py35_1    m-labs/label/main
requests                  2.19.1                   py35_1    conda-forge/label/main
rust-core-or1k            1.28.0                       21    m-labs/label/main
rustc                     1.28.0                       21    m-labs/label/main
scipy                     1.1.0            py35hfc37229_0  
setuptools                33.1.1                   py35_0    conda-forge/label/main
sip                       4.18.1           py35hfc679d8_0    conda-forge/label/main
six                       1.11.0                   py35_1    conda-forge/label/main
snowballstemmer           1.2.1                      py_1    conda-forge/label/main
sphinx                    1.4.8                    py35_0    conda-forge/label/main
sphinx-argparse           0.1.13                     py_4    m-labs/label/main
sphinx_rtd_theme          0.4.1                      py_0    conda-forge/label/main
sphinxcontrib-wavedrom    1.1.0                     <pip>
sphinxcontrib-wavedrom    1.1.0                      py_1    m-labs/label/main
sqlite                    3.25.2               hb1c47c0_0    conda-forge/label/main
system                    5.8                           2  
tk                        8.6.8                ha92aebf_0    conda-forge/label/main
urllib3                   1.23                     py35_1    conda-forge/label/main
wheel                     0.32.0                py35_1000    conda-forge/label/main
xorg-libxau               1.0.8                h470a237_6    conda-forge/label/main
xorg-libxdmcp             1.1.2                h470a237_7    conda-forge/label/main
xz                        5.2.4                h470a237_1    conda-forge/label/main
yarl                      1.2.6            py35h470a237_0    conda-forge/label/main
zlib                      1.2.11               h470a237_3    conda-forge/label/main

UART log:

 __  __ _ ____         ____ 
|  \/  (_) ___|  ___  / ___|
| |\/| | \___ \ / _ \| |    
| |  | | |___) | (_) | |___ 
|_|  |_|_|____/ \___/ \____|

MiSoC Bootloader
Copyright (c) 2017-2018 M-Labs Limited

Bootloader CRC passed
Gateware ident 4.0.dev+1408.gd0ee2c29;standalone.without-sawg
Initializing SDRAM...
DQS initial delay: 110 taps
Write leveling scan:
Module 3:
000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001101111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111100110110000000000000000000000000000000000000000
Module 2:
000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000011111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111011010000000000000000000000000
Module 1:
000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001001111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111001100000000000000000000000000000000000000000000000000000000000000
Module 0:
000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000011111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111110010000000000000000000000000000000000000000000000000000000000000
DQS initial delay: 110 taps
Write leveling: 107 112 138 130 done
Read leveling scan:
Module 3:
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
Module 2:
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001101111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
Module 1:
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
Module 0:
00000000000000000000000000000000000000000000000000000000000000000000000000000000000001111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
Read leveling: 227+-83 215+-92 192+-84 179+-91 done
SDRAM initialized
Memory test passed

Booting from flash...
Starting firmware.
[     0.000005s]  INFO(runtime): ARTIQ runtime starting...
[     0.003867s]  INFO(runtime): software ident 4.0.dev+1408.gd0ee2c29;standalone.without-sawg
[     0.012131s]  INFO(runtime): gateware ident 4.0.dev+1408.gd0ee2c29;standalone.without-sawg
[     0.020404s]  INFO(runtime): log level set to INFO by default
[     0.026116s]  INFO(runtime): UART log level set to INFO by default
[     0.032261s]  INFO(board_artiq::slave_fpga): Loading slave FPGA gateware...
[     0.039211s]  INFO(board_artiq::slave_fpga):   magic: 0x5352544d, length: 0x000bf050
[     0.046937s]  INFO(board_artiq::slave_fpga):   DONE before loading
[     1.046830s]  INFO(board_artiq::slave_fpga):   ...done
[     1.050643s]  INFO(board_artiq::serwb): waiting for AMC/RTM serwb bridge to be ready...
[     1.085372s]  INFO(board_artiq::serwb):  ...done.
[     1.088745s]  INFO(board_artiq::serwb): RTM to AMC link test...
[     2.571047s]  INFO(board_artiq::serwb):   ...passed
[     2.574594s]  INFO(board_artiq::serwb): AMC to RTM link test...
[     4.056902s]  INFO(board_artiq::serwb):   ...passed
[     4.060459s]  INFO(board_artiq::serwb): Wishbone test...
[     5.992399s]  INFO(board_artiq::serwb):   ...passed
[     5.996196s]  INFO(board_artiq::serwb): RTM gateware version 4.0.dev+1408.gd0ee2c29
[     6.003589s]  INFO(runtime): press 'e' to erase startup and idle kernels...
[     7.003006s]  INFO(runtime): continuing boot
[     7.265246s]  INFO(board_artiq::si5324): waiting for Si5324 lock...
[    13.555740s]  INFO(board_artiq::si5324):   ...locked
[    13.559508s]  INFO(board_artiq::hmc830_7043::hmc830): loading HMC830 configuration...
[    13.567442s]  INFO(board_artiq::hmc830_7043::hmc830):   ...done
[    13.573097s]  INFO(board_artiq::hmc830_7043::hmc830): setting HMC830 dividers...
[    13.580561s]  INFO(board_artiq::hmc830_7043::hmc830):   ...done
[    13.586381s]  INFO(board_artiq::hmc830_7043::hmc830): waiting for HMC830 lock...
[    13.593802s]  INFO(board_artiq::hmc830_7043::hmc830):   ...locked
[    13.600032s]  INFO(board_artiq::hmc830_7043::hmc7043): enabling HMC7043
[    13.616733s]  INFO(board_artiq::hmc830_7043::hmc7043): loading configuration...
[    13.634220s]  INFO(board_artiq::hmc830_7043::hmc7043):   ...done
[    13.638903s]  INFO(board_artiq::hmc830_7043::hmc7043): testing GPO...
[    13.646016s]  INFO(board_artiq::hmc830_7043::hmc7043):   ...passed
[    13.662230s]  INFO(board_artiq::ad9154): AD9154-0 initializing...
[    13.674111s]  INFO(board_artiq::ad9154):   ...done
[    13.747991s]  INFO(board_artiq::ad9154): AD9154-0 running PRBS test...
[    14.754355s]  INFO(board_artiq::ad9154):   ...passed
[    14.758006s]  INFO(board_artiq::ad9154): AD9154-0 running STPL test...
[    14.764795s]  INFO(board_artiq::ad9154):   c0 errors: 0
[    14.769998s]  INFO(board_artiq::ad9154):   c1 errors: 0
[    14.775208s]  INFO(board_artiq::ad9154):   c2 errors: 0
[    14.780418s]  INFO(board_artiq::ad9154):   c3 errors: 0
[    14.785342s]  INFO(board_artiq::ad9154):   ...passed
[    14.800325s]  INFO(board_artiq::ad9154): AD9154-0 initializing...
[    14.807742s]  INFO(board_artiq::ad9154):   ...done
[    14.892231s]  INFO(board_artiq::ad9154): AD9154-1 initializing...
[    14.903958s]  INFO(board_artiq::ad9154):   ...done
[    14.977827s]  INFO(board_artiq::ad9154): AD9154-1 running PRBS test...
[    15.984184s]  INFO(board_artiq::ad9154):   ...passed
[    15.987833s]  INFO(board_artiq::ad9154): AD9154-1 running STPL test...
[    15.994618s]  INFO(board_artiq::ad9154):   c0 errors: 0
[    15.999827s]  INFO(board_artiq::ad9154):   c1 errors: 0
[    16.005037s]  INFO(board_artiq::ad9154):   c2 errors: 0
[    16.010247s]  INFO(board_artiq::ad9154):   c3 errors: 0
[    16.015169s]  INFO(board_artiq::ad9154):   ...passed
[    16.030154s]  INFO(board_artiq::ad9154): AD9154-1 initializing...
[    16.037573s]  INFO(board_artiq::ad9154):   ...done
[    16.111459s]  INFO(board_artiq::jesd204sync): aligning SYSREF with RTIO...
[    16.127028s]  INFO(board_artiq::jesd204sync):   ...done (0/62 slips)
[    16.135242s]  INFO(board_artiq::jesd204sync):   margins at FPGA: -18 +16
[    16.140650s]  INFO(board_artiq::jesd204sync): calibrating SYSREF phase at DAC-0...
[    16.149429s] ERROR(runtime): failed to align SYSREF at DAC: no sync lock
[    16.154879s]  INFO(board_artiq::hmc542): card 0 channel 0 set to 4 dB
[    16.163391s]  INFO(board_artiq::hmc542): card 0 channel 1 set to 4 dB
[    16.170608s]  INFO(board_artiq::hmc542): card 1 channel 0 set to 4 dB
[    16.177825s]  INFO(board_artiq::hmc542): card 1 channel 1 set to 4 dB
[    16.185042s]  INFO(board_artiq::hmc542): card 2 channel 0 set to 4 dB
[    16.192260s]  INFO(board_artiq::hmc542): card 2 channel 1 set to 4 dB
[    16.199477s]  INFO(board_artiq::hmc542): card 3 channel 0 set to 4 dB
[    16.206694s]  INFO(board_artiq::hmc542): card 3 channel 1 set to 4 dB
[    16.213940s]  WARN(runtime): using default MAC address 02-00-00-00-00-11; consider changing it
[    16.221314s]  INFO(runtime): using default IP address 192.168.1.60
[    16.228970s]  INFO(runtime::mgmt): management interface active
[    16.242120s]  INFO(runtime::session): accepting network sessions
[    16.256063s]  INFO(runtime::session): running startup kernel
[    16.282968s]  INFO(runtime::kern_hwreq): resetting RTIO
hartytp commented 6 years ago

Looking at all channels on a fast scope. Using an ac-coupled TCM2-43X+ and a 50Ohm scope. Channel numbers refer to the physical ordering of the SMPs, with the ones nearest the SATA connectors being 7 (I recall the mapping between SMPs and DAC channels being somewhat screwy).

[Scope captures of channels 0 to 7, one image per channel]

hartytp commented 6 years ago

So we have these odd glitches and channels 3/7 show some odd curvature...

@jbqubit @marmeladapk @gkasprow can one of you look at your board with the binaries I posted and see if you reproduce this...

jordens commented 6 years ago

The "odd curvature" (assuming you are talking about the low frequency deviation from the sawtooth) on 1,3,5,7 is your balun AFAICT. Do they worry you? There are glitches on 0,1,4,5: periodic on 1,5 and random on 0,4.

jbqubit commented 6 years ago

Using @hartytp's build I also see problems with the sawtooth pattern. When errors occur, they don't occur with the same period as the sawtooth, so I've done a single-trigger capture. sawg0...sawg7

20181005_ramp_sawg__000 20181005_ramp_sawg__001 20181005_ramp_sawg__002 20181005_ramp_sawg__003 20181005_ramp_sawg__004 20181005_ramp_sawg__005 20181005_ramp_sawg__006 20181005_ramp_sawg__007

sbourdeauducq commented 6 years ago

Is there an older version where that did not happen?

hartytp commented 6 years ago

Is there an older version where that did not happen?

Not sure. Are you able to test this on your hardware? Or, is there some other issue blocking that?

We could go back and check with the commit that we thought fixed this. If you post binaries then maybe @jbqubit can test them (I won't be able to do that this week)? Since this issue only affects some channels, it's quite possible that we thought it was fixed when it wasn't. Or, it could be one of those PITA issues like an incorrect reset sequence or CDC or something that only causes issues with some builds.

sbourdeauducq commented 6 years ago

Generally lacking time + need to fix the power supply or microTCA.

gkasprow commented 6 years ago

@sbourdeauducq Is there a way to get access to your MTCA crate? I mean MCH Ethernet, NAT MCH console, Sayma USB + MMC programmer + some means to switch the power of the crate?

jordens commented 6 years ago

Given that the behavior differs between channels (where the ramp generator does not differ), this is likely something downstream and not the ramp generator.

jordens commented 6 years ago

@hartytp to clarify what you were saying:

hartytp commented 6 years ago

Is it known whether or not this happens with SAWG as well?

@jbqubit

I haven't got enough data at the moment to give a definitive answer to that, although I can retest later if necessary.

I only looked at one channel with SAWG and didn't see any evidence of this (see the shots I posted on the synchronisation issue). That same channel displayed glitches with the ramp generator.

However, without knowing the cause of this issue, it's possible that the presence/absence of glitches depends on exactly what RF waveform is produced on the various channels, so my one observation doesn't necessarily tell us much.

To frame this, when do you (or anyone) remember checking the outputs the last time? Before the synchronization/sysref work?

This is not related to the SYSREF work. @jbqubit sees this on an unmodified Sayma using the current ARTIQ master. I see it both with master and with my sync branch.

I did some tests looking at glitches shortly after the current JESD204B release (see the IRC logs). It's been a while now, so I can't recall if I verified that the glitches were absent on all channels, or if I just confirmed that after that release the channel I had been looking at no longer had glitches (i.e. I'm not sure if I can confirm whether the glitches disappeared after that release or if they just moved channels).

Once the power supply issue is fixed, it would be great to know if @sbourdeauducq can reproduce this.

marmeladapk commented 6 years ago

@hartytp Did you try to supply other reference frequencies (100, 125 MHz?). Just as a sanity check. With @jbqubit we had very similar symptoms, which were caused by wrong reference frequency (even though hmc830 locked). Perhaps hmc830 freq isn't set properly?

It's a wild guess, but it should be quick to check it.

hartytp commented 6 years ago

@hartytp Did you try to supply other reference frequencies (100, 125 MHz?). Just as a sanity check. With @jbqubit we had very similar symptoms, which were caused by wrong reference frequency (even though hmc830 locked). Perhaps hmc830 freq isn't set properly?

Good thought, but I was quite careful about this. I was also able to reproduce this issue with a variety of clocking options and reference frequencies.

Note also, that @jbqubit reproduced my measurements, so he would have to be making the same mistake as me.

sbourdeauducq commented 5 years ago

The picture here looks suspicious, I don't see large glitches but that's perhaps just because of the limited scope bandwidth. photo_2018-12-22_19-28-55

I don't know what is causing this.

hartytp commented 5 years ago

Did you look at all DAC channels?

sbourdeauducq commented 5 years ago

That's the next thing I'm planning to do, but I ran out of time for today. BTW, this was taken with the board inside the µTCA crate and Ethernet also works (through the crate).

hartytp commented 5 years ago

BTW, this was taken with the board inside the µTCA crate and Ethernet also works (through the crate).

🎆

jbqubit commented 5 years ago

perhaps just because of the limited scope bandwidth.

Your scope has enough vertical and time resolution to see the glitches that I saw. Looks like both @hartytp and I saw that only a subset of DAC channels had the glitches.

https://github.com/m-labs/artiq/issues/1166#issuecomment-427388926

hartytp commented 5 years ago

Digging through the issue trackers to remind myself the history of this issue:

sbourdeauducq commented 5 years ago

Okay, now it won't boot anymore, panic at src/libcore/result.rs:945:5 cannot load RTM FPGA gateware: "Did not assert INIT in reaction to PROGRAM". Maybe one of the reworks broke when I added more Allaki to look at other channels...

sbourdeauducq commented 5 years ago

Taking the RTM out of the crate and putting it back in again "fixed" the problem. You gotta love Sayma... There are glitches on some channels: image image

jordens commented 5 years ago

Cleaning this up and categorizing the historic and current issues observed/fixed, we have the following:

  1. The glitches we are looking at in this issue here. Earliest report (and distinct from the others, except maybe (4)) is in this issue up top.
  2. https://github.com/m-labs/artiq/issues/1022#issuecomment-393905927 and https://github.com/sinara-hw/sinara/issues/551#issuecomment-395099535 (https://pasteboard.co/HnT4Gf6.jpg) is from 2018-06-01 and before the JESD changes and before the Cat() changes. Those glitches were also qualitatively different and had a specific, deterministic pattern (when they occurred) that was addressed in the commits there. They were the dominant issue we isolated and fixed in those issues.
  3. https://github.com/m-labs/artiq/issues/1022#issuecomment-395895632 may or may not be a false positive since the preceding fix required a misoc update which was frequently overlooked.
  4. The glitches after the ramp jump (https://github.com/sinara-hw/sinara/issues/557#issuecomment-394902405 and https://github.com/sinara-hw/sinara/issues/551#issue-328542083) may or may not be the same as (1). Should be verified by cable length changes.

AFAIR there were also significant changes to the test pattern generator (without-sawg) afterwards that could have converted, and were intended to convert, rare and low-visibility glitches (like (4)) into more dramatic glitches like (1).

Ergo, let's focus on (1) first and ignore the rest for now.

Since we do not suspect the ramp generator, we should look at JESD.

hartytp commented 5 years ago

Thanks for looking into this @sbourdeauducq and @jordens. Glad to hear you can reproduce the issue. If there are any specific tests you'd like me to do then let me know.

sbourdeauducq commented 5 years ago

The problem is not present with the DRTIO satellite target (@hartytp please confirm). So this looks like a clocking gateware/firmware bug, or more HMC7043 shenanigans (with DRTIO satellite, the JESD transceivers are clocked by the Si5324 and not the HMC).

To test, compile the DRTIO satellite with the --without-sawg flag and then provide a DRTIO uplink on the SFP (with 150MHz RTIO clock, SFP0 labeled "Cage1" on the PCB) which can come from a Kasli or another Sayma acting as DRTIO master. Just establishing the DRTIO link is sufficient, no command has to be sent.

jordens commented 5 years ago

Ha. The glitches are every 40 ns. That's an even 6 coarse RTIO periods and 5 periods of 125 MHz. That's why they look random on the 600/4/16 MHz ramp and periodic on the 600/4/3 MHz sine. There isn't that much (anything?) driven at 125 MHz in DRTIO satellites, but quite a bit in master bitstreams. This could be some fabric or external crosstalk due to the beat and thus PI/SI issues. Or a gateware/firmware bug that's always there and only exposed by the beat.
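The arithmetic behind this can be checked with a short sketch. It assumes the frequencies quoted above (150 MHz coarse RTIO, 125 MHz system clock, a 600/4/16 MHz ramp and a 600/4/3 MHz sine); the helper names `period_ns` and `distinct_phases` are made up for illustration and are not part of ARTIQ.

```python
from fractions import Fraction

NS = 10**9
GLITCH_NS = Fraction(40)  # glitch spacing reported above


def period_ns(freq_hz):
    """Period in nanoseconds of a signal at freq_hz, kept exact."""
    return Fraction(NS) / Fraction(freq_hz)


def distinct_phases(glitch_ns, waveform_ns):
    """Number of different waveform phases that successive glitches land on.

    1 means every glitch hits the same point of the waveform, so it looks
    periodic on a scope triggered on that waveform.
    """
    return (glitch_ns / waveform_ns).denominator


rtio_cycles = GLITCH_NS / period_ns(150_000_000)  # coarse RTIO periods per glitch
sys_cycles = GLITCH_NS / period_ns(125_000_000)   # 125 MHz periods per glitch

ramp_ns = period_ns(Fraction(600_000_000, 4 * 16))  # 600/4/16 MHz ramp
sine_ns = period_ns(Fraction(600_000_000, 4 * 3))   # 600/4/3 MHz sine

print(rtio_cycles, sys_cycles)                # 6 5
print(distinct_phases(GLITCH_NS, sine_ns))    # 1
print(distinct_phases(GLITCH_NS, ramp_ns))    # 8
```

With these numbers the glitch always lands on the same phase of the sine, but cycles through eight different phases of the ramp, which would be consistent with it looking periodic on one and random on the other.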

sbourdeauducq commented 5 years ago

The satellite has the CPU with satman + SDRAM at 125MHz.

sbourdeauducq commented 5 years ago

We can test this theory by changing the system frequency in the master bitstream. I'll try that tomorrow.

jordens commented 5 years ago

Ack. The difference could still be due to the reduced amount of 125 MHz logic (kernel cpu and associated logic). But this is definitely at the beat. If this is PI, one might naively expect a significant "intermodulation" product on the power supplies at 25 MHz. If it is SI, then we should look at how the 125 MHz clock is getting into the 150 MHz/JESD domain. Could also be some 125 MHz hitting the DAC since its PClock is also at 150 MHz AFAIK (line rate/40). OTOH from the way the sine data is corrupted and from the almost clean 600/4/16/3 MHz periodicity of the ramp corruption it looks like it is predominantly taking "clean" data from some other/wrong group of four samples. I don't immediately see what that means but maybe EB/CDC or JESD gearbox/framing issue?
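A rough sketch of the numbers in this comment, for reference. It assumes the 150 MHz and 125 MHz clocks discussed above and takes "line rate/40" at face value; the implied 6 Gb/s line rate is derived from that, not stated in this thread, and `lcm_fraction` is a helper written just for this example.

```python
from fractions import Fraction
from math import gcd

MHZ = 10**6
NS = 10**9


def lcm_fraction(x, y):
    """Least common multiple of two positive Fractions."""
    num = x.numerator * y.numerator // gcd(x.numerator, y.numerator)
    den = gcd(x.denominator, y.denominator)
    return Fraction(num, den)


rtio_hz = 150 * MHZ
sys_hz = 125 * MHZ

# First-order difference product between the two clock domains.
intermod_hz = rtio_hz - sys_hz                 # 25 MHz
beat_ns = Fraction(NS, intermod_hz)            # 40 ns

# PClock = line rate / 40, so a 150 MHz PClock implies this line rate
# (an inference from the comment above, not a measured value).
implied_line_rate = rtio_hz * 40               # 6 Gb/s

# The ramp corruption repeats when the 40 ns beat and the ramp realign.
ramp_ns = Fraction(NS) / Fraction(600 * MHZ, 4 * 16)
pattern_ns = lcm_fraction(beat_ns, ramp_ns)    # 320 ns
pattern_hz = Fraction(NS) / pattern_ns         # 3.125 MHz

print(intermod_hz, beat_ns, implied_line_rate)
print(pattern_ns, pattern_hz)
```

Under these assumptions the realignment rate comes out at 600/4/16/3 MHz, matching the periodicity of the ramp corruption mentioned above.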

sbourdeauducq commented 5 years ago

Changed the system clock frequency using a patch similar to https://hastebin.com/nepilataje.rb, but with 21 instead of 19 (19 tickles a SDRAM bug), which yields 131.25MHz. Observed glitches (all on the same channel): image image image
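If the beat theory is right, moving the system clock from 125 MHz to 131.25 MHz should move the glitch spacing too. A rough sketch of what the theory would predict, assuming the spacing is simply the shortest interval holding a whole number of cycles of both clocks (the function names are made up for illustration, and nothing here is a measurement from the scope shots):

```python
from math import gcd

RTIO_HZ = 150_000_000  # coarse RTIO clock from the discussion above


def cycles_per_beat(sys_hz, rtio_hz=RTIO_HZ):
    """(RTIO cycles, system clock cycles) in one realignment interval."""
    common = gcd(sys_hz, rtio_hz)
    return rtio_hz // common, sys_hz // common


def predicted_spacing_ns(sys_hz, rtio_hz=RTIO_HZ):
    """Glitch spacing the beat hypothesis would predict, in nanoseconds."""
    return 1e9 / gcd(sys_hz, rtio_hz)


for sys_hz in (125_000_000, 131_250_000):
    rtio_n, sys_n = cycles_per_beat(sys_hz)
    print(f"{sys_hz / 1e6:.2f} MHz: {predicted_spacing_ns(sys_hz):.2f} ns "
          f"({rtio_n} RTIO cycles, {sys_n} system cycles)")
```

A spacing that stays at 40 ns with the 131.25 MHz bitstream would argue against the system clock beat; a spacing near 53 ns would support it.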

enjoy-digital commented 5 years ago

I just did some tests with KCU105 / AD9154 + simple test design @10Gbps linerate + pattern from ARTIQ, here is what I have:

Channel0: sds00028

Channel1: sds00029

Channel2: sds00030

Channel3: sds00031

Channels 1 and 3 seem fine (83*600/1000 ≈ 50MHz), but I was expecting to see the same pattern as with ARTIQ on channels 0 and 2, and that does not seem to be the case. Otherwise, I don't seem to see the glitches.

Please tell me if you want me to do more tests.

sbourdeauducq commented 5 years ago

@enjoy-digital thanks!

sbourdeauducq commented 5 years ago

Seems resolved with Vivado 2018.3. @hartytp please confirm.