hartytp closed 4 years ago
@marmeladapk / @jbqubit are you able to reproduce this?
Reproduced in a fresh conda environment with a current master, now clocking at 100MHz. Binaries are here, timing is met.
conda list
# packages in environment at /home/tph/anaconda3/envs/artiq-dev:
#
# Name Version Build Channel
aiohttp 3.1.3 py35_0 conda-forge/label/main
alabaster 0.7.12 py_0 conda-forge/label/main
artiq 4.0.dev0+1408.gd0ee2c29 <pip>
artiq-dev 4.0.dev 1408+gitd0ee2c29 m-labs/label/dev
asn1crypto 0.24.0 py35_3 conda-forge/label/main
async-timeout 2.0.1 py35_0 conda-forge/label/main
asyncserial 0.1 py_13+git340e430 m-labs/label/main
attrs 18.2.0 py_0 conda-forge/label/main
babel 2.6.0 py_1 conda-forge/label/main
binutils-or1k-linux 2.30 7 m-labs/label/main
blas 1.0 mkl
bscan-spi-bitstreams 0.10.0 2 m-labs/label/main
bzip2 1.0.6 h470a237_2 conda-forge/label/main
ca-certificates 2018.8.24 ha4d7672_0 conda-forge/label/main
certifi 2018.8.24 py35_1001 conda-forge/label/main
cffi 1.11.5 py35h5e8e0c9_1 conda-forge/label/main
chardet 3.0.4 py35_3 conda-forge/label/main
colorama 0.3.9 py_1 conda-forge/label/main
coverage 4.5.1 py35h470a237_1 conda-forge/label/main
cryptography 2.3.1 py35hdffb7b8_0 conda-forge/label/main
cryptography-vectors 2.3.1 py35_0 conda-forge/label/main
dbus 1.13.0 h3a4f0e9_0 conda-forge/label/main
docutils 0.14 py35_1 conda-forge/label/main
expat 2.2.5 hfc679d8_2 conda-forge/label/main
fontconfig 2.13.1 h65d0f4c_0 conda-forge/label/main
freetype 2.9.1 h6debe1e_4 conda-forge/label/main
gettext 0.19.8.1 h5e8e0c9_1 conda-forge/label/main
glib 2.55.0 h464dc38_2 conda-forge/label/main
gst-plugins-base 1.12.5 hde13a9d_0 conda-forge/label/main
gstreamer 1.12.5 h61a6719_0 conda-forge/label/main
h5py 2.8.0 py35h7eb728f_2 conda-forge/label/main
hdf5 1.10.2 hc401514_2 conda-forge/label/main
icu 58.2 hfc679d8_0 conda-forge/label/main
idna 2.7 py35_2 conda-forge/label/main
idna_ssl 1.0.0 0 conda-forge/label/main
imagesize 1.1.0 py_0 conda-forge/label/main
intel-openmp 2019.0 118
jesd204b 0.10 py_1 m-labs/label/main
jinja2 2.10 py_1 conda-forge/label/main
jpeg 9c h470a237_1 conda-forge/label/main
levenshtein 0.12.0 py35_1 m-labs/label/main
libcurl 7.61.0 h1ad7b7a_0
libffi 3.2.1 hfc679d8_5 conda-forge/label/main
libgcc-ng 7.2.0 hdf63c60_3 conda-forge/label/main
libgfortran 3.0.0 1 conda-forge/label/main
libgfortran-ng 7.2.0 hdf63c60_3 conda-forge/label/main
libgit2 0.24.1 7 m-labs/label/main
libiconv 1.15 h470a237_3 conda-forge/label/main
libpng 1.6.35 ha92aebf_2 conda-forge/label/main
libssh2 1.8.0 h5b517e9_2 conda-forge/label/main
libstdcxx-ng 7.2.0 hdf63c60_3 conda-forge/label/main
libusb 1.0.20 0 m-labs/label/main
libuuid 2.32.1 h470a237_2 conda-forge/label/main
libxcb 1.13 h470a237_2 conda-forge/label/main
libxml2 2.9.8 h422b904_5 conda-forge/label/main
lit 0.4.1 py_9 m-labs/label/main
llvm-or1k 6.0.0 25 m-labs/label/main
llvmlite-artiq 0.23.0.dev py35_4 m-labs/label/main
markupsafe 1.0 py35h470a237_1 conda-forge/label/main
microscope 1.3 py_1 m-labs/label/main
migen 0.7 py35_73+gitbef9dea m-labs/label/dev
misoc 0.11 py35_31+git5ce139dd m-labs/label/dev
mkl 2019.0 118
mkl_fft 1.0.6 py35_0 conda-forge/label/main
mkl_random 1.0.1 py35_0 conda-forge/label/main
msgpack-python 0.5.6 py35h2d50403_3 conda-forge/label/main
multidict 4.4.2 py35h470a237_0 conda-forge/label/main
ncurses 6.1 hfc679d8_1 conda-forge/label/main
numpy 1.15.0 py35h1b885b7_0
numpy-base 1.15.0 py35h3dfced4_0
openocd 0.10.0 6 m-labs/label/main
openssl 1.0.2p h470a237_0 conda-forge/label/main
outputcheck 0.4.2 py_7 m-labs/label/main
pcre 8.41 hfc679d8_3 conda-forge/label/main
pip 18.0 py35_1001 conda-forge/label/main
prettytable 0.7.2 py_2 conda-forge/label/main
pthread-stubs 0.4 h470a237_1 conda-forge/label/main
pycparser 2.19 py_0 conda-forge/label/main
pygit2 0.24.0 py35_4 m-labs/label/main
pygments 2.2.0 py_1 conda-forge/label/main
pyopenssl 18.0.0 py35_0 conda-forge/label/main
pyqt 5.6.0 py35h8210e8a_7 conda-forge/label/main
pyqtgraph 0.10.0 py_5 conda-forge/label/main
pyserial 3.4 py35_0 conda-forge/label/main
pysocks 1.6.8 py35_2 conda-forge/label/main
python 3.5.5 h5001a0f_2 conda-forge/label/main
python-dateutil 2.7.3 py_0 conda-forge/label/main
pythonparser 1.1 py_8 m-labs/label/main
pytz 2018.5 py_0 conda-forge/label/main
qt 5.6.2 hf70d934_9 conda-forge/label/main
quamash 0.5.5 py_4 m-labs/label/main
readline 7.0 haf1bffa_1 conda-forge/label/main
regex 2015.11.22 py35_1 m-labs/label/main
requests 2.19.1 py35_1 conda-forge/label/main
rust-core-or1k 1.28.0 21 m-labs/label/main
rustc 1.28.0 21 m-labs/label/main
scipy 1.1.0 py35hfc37229_0
setuptools 33.1.1 py35_0 conda-forge/label/main
sip 4.18.1 py35hfc679d8_0 conda-forge/label/main
six 1.11.0 py35_1 conda-forge/label/main
snowballstemmer 1.2.1 py_1 conda-forge/label/main
sphinx 1.4.8 py35_0 conda-forge/label/main
sphinx-argparse 0.1.13 py_4 m-labs/label/main
sphinx_rtd_theme 0.4.1 py_0 conda-forge/label/main
sphinxcontrib-wavedrom 1.1.0 <pip>
sphinxcontrib-wavedrom 1.1.0 py_1 m-labs/label/main
sqlite 3.25.2 hb1c47c0_0 conda-forge/label/main
system 5.8 2
tk 8.6.8 ha92aebf_0 conda-forge/label/main
urllib3 1.23 py35_1 conda-forge/label/main
wheel 0.32.0 py35_1000 conda-forge/label/main
xorg-libxau 1.0.8 h470a237_6 conda-forge/label/main
xorg-libxdmcp 1.1.2 h470a237_7 conda-forge/label/main
xz 5.2.4 h470a237_1 conda-forge/label/main
yarl 1.2.6 py35h470a237_0 conda-forge/label/main
zlib 1.2.11 h470a237_3 conda-forge/label/main
UART log:
__ __ _ ____ ____
| \/ (_) ___| ___ / ___|
| |\/| | \___ \ / _ \| |
| | | | |___) | (_) | |___
|_| |_|_|____/ \___/ \____|
MiSoC Bootloader
Copyright (c) 2017-2018 M-Labs Limited
Bootloader CRC passed
Gateware ident 4.0.dev+1408.gd0ee2c29;standalone.without-sawg
Initializing SDRAM...
DQS initial delay: 110 taps
Write leveling scan:
Module 3:
000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001101111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111100110110000000000000000000000000000000000000000
Module 2:
000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000011111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111011010000000000000000000000000
Module 1:
000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001001111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111001100000000000000000000000000000000000000000000000000000000000000
Module 0:
000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000011111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111110010000000000000000000000000000000000000000000000000000000000000
DQS initial delay: 110 taps
Write leveling: 107 112 138 130 done
Read leveling scan:
Module 3:

Module 2:

Module 1:

Module 0:

Read leveling: 227+-83 215+-92 192+-84 179+-91 done
SDRAM initialized
Memory test passed
Booting from flash...
Starting firmware.
[ 0.000005s] INFO(runtime): ARTIQ runtime starting...
[ 0.003867s] INFO(runtime): software ident 4.0.dev+1408.gd0ee2c29;standalone.without-sawg
[ 0.012131s] INFO(runtime): gateware ident 4.0.dev+1408.gd0ee2c29;standalone.without-sawg
[ 0.020404s] INFO(runtime): log level set to INFO by default
[ 0.026116s] INFO(runtime): UART log level set to INFO by default
[ 0.032261s] INFO(board_artiq::slave_fpga): Loading slave FPGA gateware...
[ 0.039211s] INFO(board_artiq::slave_fpga): magic: 0x5352544d, length: 0x000bf050
[ 0.046937s] INFO(board_artiq::slave_fpga): DONE before loading
[ 1.046830s] INFO(board_artiq::slave_fpga): ...done
[ 1.050643s] INFO(board_artiq::serwb): waiting for AMC/RTM serwb bridge to be ready...
[ 1.085372s] INFO(board_artiq::serwb): ...done.
[ 1.088745s] INFO(board_artiq::serwb): RTM to AMC link test...
[ 2.571047s] INFO(board_artiq::serwb): ...passed
[ 2.574594s] INFO(board_artiq::serwb): AMC to RTM link test...
[ 4.056902s] INFO(board_artiq::serwb): ...passed
[ 4.060459s] INFO(board_artiq::serwb): Wishbone test...
[ 5.992399s] INFO(board_artiq::serwb): ...passed
[ 5.996196s] INFO(board_artiq::serwb): RTM gateware version 4.0.dev+1408.gd0ee2c29
[ 6.003589s] INFO(runtime): press 'e' to erase startup and idle kernels...
[ 7.003006s] INFO(runtime): continuing boot
[ 7.265246s] INFO(board_artiq::si5324): waiting for Si5324 lock...
[ 13.555740s] INFO(board_artiq::si5324): ...locked
[ 13.559508s] INFO(board_artiq::hmc830_7043::hmc830): loading HMC830 configuration...
[ 13.567442s] INFO(board_artiq::hmc830_7043::hmc830): ...done
[ 13.573097s] INFO(board_artiq::hmc830_7043::hmc830): setting HMC830 dividers...
[ 13.580561s] INFO(board_artiq::hmc830_7043::hmc830): ...done
[ 13.586381s] INFO(board_artiq::hmc830_7043::hmc830): waiting for HMC830 lock...
[ 13.593802s] INFO(board_artiq::hmc830_7043::hmc830): ...locked
[ 13.600032s] INFO(board_artiq::hmc830_7043::hmc7043): enabling HMC7043
[ 13.616733s] INFO(board_artiq::hmc830_7043::hmc7043): loading configuration...
[ 13.634220s] INFO(board_artiq::hmc830_7043::hmc7043): ...done
[ 13.638903s] INFO(board_artiq::hmc830_7043::hmc7043): testing GPO...
[ 13.646016s] INFO(board_artiq::hmc830_7043::hmc7043): ...passed
[ 13.662230s] INFO(board_artiq::ad9154): AD9154-0 initializing...
[ 13.674111s] INFO(board_artiq::ad9154): ...done
[ 13.747991s] INFO(board_artiq::ad9154): AD9154-0 running PRBS test...
[ 14.754355s] INFO(board_artiq::ad9154): ...passed
[ 14.758006s] INFO(board_artiq::ad9154): AD9154-0 running STPL test...
[ 14.764795s] INFO(board_artiq::ad9154): c0 errors: 0
[ 14.769998s] INFO(board_artiq::ad9154): c1 errors: 0
[ 14.775208s] INFO(board_artiq::ad9154): c2 errors: 0
[ 14.780418s] INFO(board_artiq::ad9154): c3 errors: 0
[ 14.785342s] INFO(board_artiq::ad9154): ...passed
[ 14.800325s] INFO(board_artiq::ad9154): AD9154-0 initializing...
[ 14.807742s] INFO(board_artiq::ad9154): ...done
[ 14.892231s] INFO(board_artiq::ad9154): AD9154-1 initializing...
[ 14.903958s] INFO(board_artiq::ad9154): ...done
[ 14.977827s] INFO(board_artiq::ad9154): AD9154-1 running PRBS test...
[ 15.984184s] INFO(board_artiq::ad9154): ...passed
[ 15.987833s] INFO(board_artiq::ad9154): AD9154-1 running STPL test...
[ 15.994618s] INFO(board_artiq::ad9154): c0 errors: 0
[ 15.999827s] INFO(board_artiq::ad9154): c1 errors: 0
[ 16.005037s] INFO(board_artiq::ad9154): c2 errors: 0
[ 16.010247s] INFO(board_artiq::ad9154): c3 errors: 0
[ 16.015169s] INFO(board_artiq::ad9154): ...passed
[ 16.030154s] INFO(board_artiq::ad9154): AD9154-1 initializing...
[ 16.037573s] INFO(board_artiq::ad9154): ...done
[ 16.111459s] INFO(board_artiq::jesd204sync): aligning SYSREF with RTIO...
[ 16.127028s] INFO(board_artiq::jesd204sync): ...done (0/62 slips)
[ 16.135242s] INFO(board_artiq::jesd204sync): margins at FPGA: -18 +16
[ 16.140650s] INFO(board_artiq::jesd204sync): calibrating SYSREF phase at DAC-0...
[ 16.149429s] ERROR(runtime): failed to align SYSREF at DAC: no sync lock
[ 16.154879s] INFO(board_artiq::hmc542): card 0 channel 0 set to 4 dB
[ 16.163391s] INFO(board_artiq::hmc542): card 0 channel 1 set to 4 dB
[ 16.170608s] INFO(board_artiq::hmc542): card 1 channel 0 set to 4 dB
[ 16.177825s] INFO(board_artiq::hmc542): card 1 channel 1 set to 4 dB
[ 16.185042s] INFO(board_artiq::hmc542): card 2 channel 0 set to 4 dB
[ 16.192260s] INFO(board_artiq::hmc542): card 2 channel 1 set to 4 dB
[ 16.199477s] INFO(board_artiq::hmc542): card 3 channel 0 set to 4 dB
[ 16.206694s] INFO(board_artiq::hmc542): card 3 channel 1 set to 4 dB
[ 16.213940s] WARN(runtime): using default MAC address 02-00-00-00-00-11; consider changing it
[ 16.221314s] INFO(runtime): using default IP address 192.168.1.60
[ 16.228970s] INFO(runtime::mgmt): management interface active
[ 16.242120s] INFO(runtime::session): accepting network sessions
[ 16.256063s] INFO(runtime::session): running startup kernel
[ 16.282968s] INFO(runtime::kern_hwreq): resetting RTIO
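With a boot log this long, the one `ERROR` line (the SYSREF alignment failure at DAC-0) is easy to miss among the `INFO` lines. The snippet below is a minimal sketch for pulling non-INFO records out of such a log. It assumes only the line format visible above (`[ <seconds>s] LEVEL(module): message`); the helper names `parse_line` and `problems` are made up for illustration and are not part of ARTIQ.

```python
import re

# Matches runtime log lines of the form seen in the UART log above, e.g.
# "[ 16.149429s] ERROR(runtime): failed to align SYSREF at DAC: no sync lock"
LOG_LINE = re.compile(
    r"\[\s*(?P<time>\d+\.\d+)s\]\s+"
    r"(?P<level>[A-Z]+)\((?P<module>[^)]+)\):\s*(?P<message>.*)"
)


def parse_line(line):
    """Return (time_s, level, module, message), or None for non-matching lines."""
    m = LOG_LINE.match(line.strip())
    if m is None:
        return None  # bootloader output, banners, blank lines, ...
    return (float(m.group("time")), m.group("level"),
            m.group("module"), m.group("message"))


def problems(lines, levels=("WARN", "ERROR")):
    """Keep only the records whose level is in `levels`."""
    records = (parse_line(line) for line in lines)
    return [rec for rec in records if rec is not None and rec[1] in levels]


# A few lines copied from the log above (hypothetical usage).
sample = [
    "SDRAM initialized",
    "[ 16.140650s] INFO(board_artiq::jesd204sync): calibrating SYSREF phase at DAC-0...",
    "[ 16.149429s] ERROR(runtime): failed to align SYSREF at DAC: no sync lock",
    "[ 16.213940s] WARN(runtime): using default MAC address 02-00-00-00-00-11; consider changing it",
]

for time_s, level, module, message in problems(sample):
    print(f"{time_s:10.6f}s {level:5} {module}: {message}")
```

Lines that do not match the pattern (bootloader output, leveling scans) are skipped rather than treated as errors.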
Looking at all channels on a fast scope, using an AC-coupled TCM2-43X+ and a 50 Ohm scope. Channel numbers refer to the physical ordering of the SMPs, with the ones nearest the SATA connectors being 7 (I recall the mapping between SMPs and DAC channels being somewhat screwy).
[Scope captures for channels 0 to 7]
So we have these odd glitches and channels 3/7 show some odd curvature...
@jbqubit @marmeladapk @gkasprow can one of you look at your board with the binaries I posted and see if you reproduce this...
The "odd curvature" (assuming you are talking about the low frequency deviation from the sawtooth) on 1,3,5,7 is your balun AFAICT. Do they worry you? There are glitches on 0,1,4,5: periodic on 1,5 and random on 0,4.
Using @hartytp's build I also see problems with the sawtooth pattern. When errors occur, they don't occur with the same period as the sawtooth, so I've done a single-trigger capture. sawg0...sawg7
Is there an older version where that did not happen?
Not sure. Are you able to test this on your hardware? Or, is there some other issue blocking that?
We could go back and check with the commit that we thought fixed this. If you post binaries then maybe @jbqubit can test them (I won't be able to do that this week)? Since this issue only affects some channels, it's quite possible that we thought it was fixed when it wasn't. Or, it could be one of those PITA issues like an incorrect reset sequence or CDC or something that only causes issues with some builds.
Generally lacking time + need to fix the power supply or microTCA.
@sbourdeauducq Is there a way to get access to your MTCA crate? I mean MCH Ethernet, NAT MCH console, Sayma USB + MMC programmer + some means to switch the power of the crate?
Given that the behavior differs between channels (where the ramp generator does not differ), this is likely something downstream and not the ramp generator.
@hartytp to clarify what you were saying:
Is it known whether or not this happens with SAWG as well?
@jbqubit
I haven't got enough data at the moment to give a definitive answer to that, although I can retest later if necessary.
I only looked at one channel with SAWG and didn't see any evidence of this (see the shots I posted on the synchronisation issue). That same channel displayed glitches with the ramp generator.
However, without knowing the cause of this issue, it's possible that the presence/absence of glitches depends on exactly what RF waveform is produced on the various channels, so my one observation doesn't necessarily tell us much.
To frame this, when do you (or anyone) remember checking the outputs the last time? Before the synchronization/sysref work?
This is not related to the SYSREF work. @jbqubit sees this on an unmodified Sayma using the current ARTIQ master. I see it both with master and with my sync branch.
I did some tests looking at glitches shortly after the current JESD204B release (see the IRC logs). It's been a while now, so I can't recall if I verified that the glitches were absent on all channels, or if I just confirmed that after that release the channel I had been looking at no longer had glitches (i.e. I'm not sure if I can confirm whether the glitches disappeared after that release or if they just moved channels).
Once the power supply issue is fixed, it would be great to know if @sbourdeauducq can reproduce this.
@hartytp Did you try to supply other reference frequencies (100, 125 MHz?). Just as a sanity check. With @jbqubit we had very similar symptoms, which were caused by wrong reference frequency (even though hmc830 locked). Perhaps hmc830 freq isn't set properly?
It's a wild guess, but it should be quick to check it.
Good thought, but I was quite careful about this. I was also able to reproduce this issue with a variety of clocking options and reference frequencies.
Note also, that @jbqubit reproduced my measurements, so he would have to be making the same mistake as me.
The picture here looks suspicious, I don't see large glitches but that's perhaps just because of the limited scope bandwidth.
I don't know what is causing this.
Did you look at all DAC channels?
That's the next thing I'm planning to do, but I ran out of time for today. BTW, this was taken with the board inside the µTCA crate and Ethernet also works (through the crate).
BTW, this was taken with the board inside the µTCA crate and Ethernet also works (through the crate).
🎆
perhaps just because of the limited scope bandwidth.
Your scope has enough vertical and time resolution to see the glitches that I saw. Looks like both @hartytp and I saw that only a subset of DAC channels had the glitches.
https://github.com/m-labs/artiq/issues/1166#issuecomment-427388926
Digging through the issue trackers to remind myself of the history of this issue:
Okay, now it won't boot anymore, panic at src/libcore/result.rs:945:5 cannot load RTM FPGA gateware: "Did not assert INIT in reaction to PROGRAM"
Maybe one of the reworks broke when I added more Allaki to look at other channels...
Taking the RTM out of the crate and putting it back in again "fixed" the problem. You gotta love Sayma... There are glitches on some channels:
Cleaning this up and categorizing the historic and current issues observed/fixed, we have the following:
AFAIR there were also significant changes to the test pattern generator (without-sawg) afterwards that could have converted, and were intended to convert, rare and low-visibility glitches (like 4) into more dramatic glitches like (1).
Ergo, let's focus on (1) first and ignore the rest for now.
Since we do not suspect the ramp generator, we should look at JESD.
Thanks for looking into this @sbourdeauducq and @jordens. Glad to hear you can reproduce the issue. If there are any specific tests you'd like me to do then let me know.
The problem is not present with the DRTIO satellite target (@hartytp please confirm). So this looks like a clocking gateware/firmware bug, or more HMC7043 shenanigans (with DRTIO satellite, the JESD transceivers are clocked by the Si5324 and not the HMC).
To test, compile the DRTIO satellite with the --without-sawg flag and then provide a DRTIO uplink on the SFP (with 150MHz RTIO clock, SFP0 labeled "Cage1" on the PCB), which can come from a Kasli or another Sayma acting as DRTIO master. Just establishing the DRTIO link is sufficient; no command has to be sent.
Ha. The glitches are every 40 ns. That's an even 6 coarse RTIO periods and 5 periods of 125 MHz. That's why they look random on the 600/4/16 MHz ramp and periodic on the 600/4/3 MHz sine. There isn't that much (anything?) driven at 125 MHz in DRTIO satellites, but quite a bit in master bitstreams. This could be some fabric or external crosstalk due to the beat and thus PI/SI issues. Or a gateware/firmware bug that's always there and only exposed by the beat.
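The arithmetic in the comment above can be checked with a short sketch. It reads "600/4" as the 150 MHz coarse RTIO clock, "600/4/16 MHz" as the ramp repetition rate and "600/4/3 MHz" as the sine frequency; that reading, and the helper names `common_frequency` and `glitch_phases`, are assumptions made for illustration. A glitch that repeats every 40 ns always lands on the same phase of the 50 MHz sine, but cycles through eight different positions on the ramp, which would explain why it looks periodic on one and random on the other.

```python
from fractions import Fraction
from math import gcd


def common_frequency(f1, f2):
    """Largest frequency of which both f1 and f2 are integer multiples."""
    f1, f2 = Fraction(f1), Fraction(f2)
    num = gcd(f1.numerator * f2.denominator, f2.numerator * f1.denominator)
    return Fraction(num, f1.denominator * f2.denominator)


def glitch_phases(glitch_period, waveform_period, count=64):
    """Distinct positions (as fractions of one waveform period) hit by the glitch."""
    return sorted({(n * glitch_period / waveform_period) % 1 for n in range(count)})


F_RTIO = Fraction(150)  # MHz, coarse RTIO clock (assumed reading of 600/4)
F_SYS = Fraction(125)   # MHz, the other clock named in the comment

f_common = common_frequency(F_RTIO, F_SYS)  # common sub-harmonic in MHz
glitch_period = 1000 / f_common             # in ns

print("common frequency:", f_common, "MHz; period:", glitch_period, "ns")
print("RTIO periods per glitch period:", F_RTIO / f_common)
print("125 MHz periods per glitch period:", F_SYS / f_common)

ramp_period = 1000 / (F_RTIO / 16)  # ns, ramp repeating every 16 coarse periods
sine_period = 1000 / (F_RTIO / 3)   # ns, sine repeating every 3 coarse periods

print("glitch phases on sine:", [str(p) for p in glitch_phases(glitch_period, sine_period)])
print("glitch phases on ramp:", [str(p) for p in glitch_phases(glitch_period, ramp_period)])
```

Exact rational arithmetic (`Fraction`) is used so that the phase sets are not blurred by floating-point rounding.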
The satellite has the CPU with satman + SDRAM at 125MHz.
We can test this theory by changing the system frequency in the master bitstream. I'll try that tomorrow.
Ack. The difference could still be due to the reduced amount of 125 MHz logic (kernel cpu and associated logic). But this is definitely at the beat. If this is PI, one might naively expect a significant "intermodulation" product on the power supplies at 25 MHz. If it is SI, then we should look at how the 125 MHz clock is getting into the 150 MHz/JESD domain. Could also be some 125 MHz hitting the DAC since its PClock is also at 150 MHz AFAIK (line rate/40). OTOH from the way the sine data is corrupted and from the almost clean 600/4/16/3 MHz periodicity of the ramp corruption it looks like it is predominantly taking "clean" data from some other/wrong group of four samples. I don't immediately see what that means but maybe EB/CDC or JESD gearbox/framing issue?
Changed the system clock frequency using a patch similar to https://hastebin.com/nepilataje.rb, but with 21 instead of 19 (19 tickles an SDRAM bug), which yields 131.25MHz. Observed glitches (all on the same channel):
I just did some tests with KCU105 / AD9154 + simple test design @ 10 Gbps line rate + pattern from ARTIQ. Here is what I have:
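As a rough guide to what the beat hypothesis would predict for this test, the sketch below works out the numbers. It assumes that the patched value scales the system clock linearly (for example as a PLL multiplier), so that 21 giving 131.25 MHz implies 6.25 MHz per unit; the patch itself is not reproduced in this thread, so that scaling is an inference, as are the names `sys_clock_mhz` and `beat_period_ns`. The beat is taken here as the plain difference frequency against the 150 MHz RTIO clock.

```python
from fractions import Fraction

# Assumption: system clock = multiplier * STEP_MHZ, inferred only from
# "21 ... yields 131.25MHz" in the comment above.
STEP_MHZ = Fraction("131.25") / 21  # 6.25 MHz per multiplier unit

F_RTIO_MHZ = 150  # coarse RTIO clock from the earlier comments


def sys_clock_mhz(multiplier):
    """System clock in MHz for a given multiplier, under the linear assumption."""
    return multiplier * STEP_MHZ


def beat_period_ns(f1_mhz, f2_mhz):
    """Period in ns of the difference frequency between two clocks given in MHz."""
    return 1000 / abs(Fraction(f1_mhz) - Fraction(f2_mhz))


for mult in (19, 20, 21):
    f_sys = sys_clock_mhz(mult)
    period = beat_period_ns(F_RTIO_MHZ, f_sys)
    print(f"multiplier {mult}: sys = {float(f_sys):.2f} MHz, "
          f"beat period = {float(period):.2f} ns")
```

Under these assumptions a multiplier of 20 reproduces the original 125 MHz and 40 ns spacing, while 21 would move the spacing to 160/3 ns (about 53.3 ns), so a glitch period that follows this change would support the beat explanation.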
Channel0:
Channel1:
Channel2:
Channel3:
Channels 1 and 3 seem fine (83*600/1000 ≈ 50 MHz), but I was expecting to see the same pattern as with ARTIQ on channels 0 and 2, which does not seem to be the case. Otherwise, I don't see the glitches.
Please tell me if you want me to do more tests.
@enjoy-digital thanks!
Seems resolved with Vivado 2018.3. @hartytp please confirm.
Building in a fresh artiq-dev=4.0.dev conda environment with the latest master. Only change is to set the HMC830 reference frequency to 150MHz in the Sayma AMC target. Clocking at 150MHz from a synth.
Building without sawg, I see glitches on the ramp generator's output
Some channels look better than others, but all have some level of glitch. This could well be related to the synchronisation issues I see...