Closed codecubepi closed 1 month ago
Update:
After logging some signals using the new AMDS driver firmware, I get the following graphs:
This is logging one second of data for two channels. Channel 1 is a 10 Hz sine wave and Channel 2 is a 5 Hz skewed triangle wave. Both channels appear to have flawless data. However, I am still seeing that the "data is invalid" and the corrupt counter go up.
I am running the timing manager and AMDS driver with the following parameters: trigger on both PWM high and low, with a user ratio of 4. This is effectively toggling SYNC_ADC at 50 kHz, or every 20us, or as fast as the sensor can currently be triggered,
I am noticing a few strange things when I check the channel valid register. It's value is always either 0, 15, 240, or 255. These values mean the following:
Value (Decimal) | Value (Binary) | Meaning |
---|---|---|
0 | 0000_0000 | All channels are invalid |
15 | 0000_1111 | Only channels 1-4 are valid |
240 | 1111_0000 | Only channels 5-8 are valid |
255 | 1111_1111 | All channels are valid |
These valid bits are all cleared when a new set of packets comes in, and should be being set one by one, as each of four packets looks valid. But this is pretty suspicious, to only see these statuses for valid. There are a few possible explanations:
adc_uart
state machine that the packet counter should not be at some random value (outside 0-3). If it is some random value, all the valid bits are cleared in "panic". This check could probably be removed, it's not super necessary and could be causing unnecessary panics. If nothing else, it would be a good thing to remove for a debug sanity test. Also, the width of the packet counter could be reduced from 4 bits to 2 bits so that it will always be 0-3.Regardless of the reason for this, we still have the issue with the corrupt counter. The corrupt counter is incrementing because the uart_rx
module is telling its parent adc_uart
module that either the header, data MSB, or data LSB is "corrupt", meaning it failed the parity check. But again, I don't actually see any evidence of truly corrupt data.
With help from @npetersen2 the issue has been solved! See commit 8cad1d6.
The issue was in the FPGA cycle delay caused by the hierarchical state machines in the driver. The amdc_amds_yada_AXI.v
file double-flops the incoming data lines from the AMDS for metastability, and then detects a falling edge on the flopped data line to start the SM in adc_uart_rx.v
, which handles the receving of the four packets (HEADER, MSB, and LSB).
That state machine starts the other state machine in uart_rx.v
, which receives the individual UART bytes. After this state machine is started by the parent, it looks for the falling edge of the start bit.
The problem was that the sequential delay of the start signal being flopped through the state machine hierarchy meant that the start bit had sometimes already passed, so the uart_rx
state machine was detecting a falling edge of a data bit instead, which caused a parity check fail and so on.
The solution was to flop the data line a few more times in uart_rx.v
, to guarantee that the WAIT_FOR_START_BIT
state was always entered before receiving the start bit:
Results below:
Closing with merge of #394 into the staging branch.
Testing of the new AMDS driver has been going well, save for one problem(?).
When running the timing manager in automatic triggering mode (the default), the AMDS FPGA counters report invalid data (meaning that the parity check failed). The problem seems to get worse with lower and lower user ratios. It does not seems to occur with sufficiently high user ratio or in MANUAL triggering mode.
I can't seem to actually "capture" any invalid data, so I'm not entirely convinced this is a big deal, but it might require further testing.