npetersen2 closed this issue 3 years ago
Looking at raw UART data over wire... Are packets missed there??
Here, I attached a logic analyzer to the raw UART signals between the AMDC and the motherboard.
Here is a screenshot capturing ~10 seconds of data:
...Lots of data!
If we zoom in, we can see that data is sent over the UART OUT lines every time the UART IN2 line has an edge (the firmware design I chose). The AMDC toggles this line every 100us, i.e., at each 10kHz control update it asks for new data. This is working fine; there are no gaps where it fails to ask for data!
If we zoom in more into one single UART packet, we can see the decoded bytes. I wrote the code to use two wires to send 8 data packets (for 8 ADC channels). The header bytes 0x90 to 0x93 encode which ADC channel is coming in, 0..3. The next two bytes are the ADC data, from the 16-bit ADCs. This is working great. We can see the UART is indeed running at 25Mbaud. Fast!
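The packet layout described above (one header byte followed by two data bytes) can be sketched as a small offline parser. This is only an illustration, not the FPGA or notebook code: the function name `parse_packets` is made up, and the byte order of the two data bytes (most significant byte first) is an assumption, since the thread does not state it.

```python
HEADER_MIN = 0x90  # header for ADC channel 0
HEADER_MAX = 0x93  # header for ADC channel 3


def parse_packets(raw, msb_first=True):
    """Return a list of (channel, adc_code) tuples from a stream of UART bytes.

    Assumption: each packet is [header, data, data]. The data byte order is
    NOT given in the thread, so it is selectable here.
    """
    samples = []
    i = 0
    while i + 2 < len(raw):
        header = raw[i]
        if HEADER_MIN <= header <= HEADER_MAX:
            channel = header - HEADER_MIN
            b0, b1 = raw[i + 1], raw[i + 2]
            code = (b0 << 8) | b1 if msb_first else (b1 << 8) | b0
            samples.append((channel, code))
            i += 3  # consume the whole packet
        else:
            i += 1  # not a header: skip one byte and try to resynchronize
    return samples


stream = bytes([0x90, 0x12, 0x34, 0x91, 0xAB, 0xCD, 0x00, 0x93, 0x00, 0x01])
packets = parse_packets(stream)
```

Converting the 16-bit code to a voltage depends on the ADC coding and front-end gain, which are not described here, so that step is left out.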
NOW, we can get the logic analyzer program to give us a giant list of all the data bytes it found:
Pulling this into Python, we can parse these bytes offline and recreate the resulting data stream. FYI, the FPGA is doing this parsing in real-time; we are just recreating it here in Python for debugging (i.e. to see if any data is dropped within the motherboard firmware itself).
After some Python magic, we get a pandas dataframe of all 98k samples from a single channel over UART, ADC channel 0.
Each row holds the timestamp at which the data appeared in the logic analyzer and the voltage that the bytes encode.
You can see this is working great: consecutive timestamps are 100us apart, like we expect.
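The 100us spacing can also be checked programmatically rather than by eye. A minimal sketch, assuming pandas and a dataframe with hypothetical column names `time` (seconds) and `voltage`; the real notebook may use different names, and the 20% tolerance is arbitrary.

```python
import pandas as pd


def find_timing_gaps(df, expected_dt=100e-6, tol=0.2):
    """Return the rows whose spacing from the previous row is not ~expected_dt.

    'time' is a hypothetical column name; tol is a fraction of expected_dt.
    """
    dt = df["time"].diff()  # first entry is NaN and is never flagged
    bad = (dt - expected_dt).abs() > tol * expected_dt
    return df.loc[bad].assign(dt=dt[bad])


# Toy data: one packet is missing between 200us and 400us.
df = pd.DataFrame(
    {
        "time": [0.0, 100e-6, 200e-6, 400e-6, 500e-6],
        "voltage": [0.0, 0.5, 1.0, 2.0, 2.5],
    }
)
gaps = find_timing_gaps(df)
```

An empty result means every sample arrived one control period after the previous one.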
Plotting all 98k samples gives us a big blob:
Plotting a small window (0 to 10ms) proves this is indeed the 16V pk-pk (+8V to -8V) signal:
Now, we plot the delta of the raw data from above. This should show the same kind of crap we saw at the beginning of this post if the raw data itself is missing packets (i.e. the issue is on the motherboard itself).
This plot above shows the one diff() of the data, zoomed in to see many periods. Clearly, there are no missed packets in the data....
If we zoom out to all 98k samples, we should be able to see if there is any fuzzy outline to indicate missed packets:
And the answer is.... NO. There are a few little spurs, but basically, perfect data is being sent from the motherboard over the UART wire. Therefore, the issue is NOT on the motherboard itself.
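To see why one diff() exposes a dropped or repeated sample, here is a small synthetic sketch using NumPy. The 100 Hz signal frequency is an assumption (the thread only gives the 16V pk-pk amplitude and the 10kHz sample rate), and the repeated sample is deliberately placed at a zero crossing, where the slope is largest and the effect is most visible.

```python
import numpy as np

fs = 10_000.0   # sample rate from the thread (10kHz)
f_sig = 100.0   # ASSUMED function generator frequency
amp = 8.0       # +8V to -8V, i.e. 16V pk-pk

t = np.arange(2000) / fs
clean = amp * np.sin(2 * np.pi * f_sig * t)

# Emulate a stale read: sample 500 repeats sample 499.
stale = clean.copy()
stale[500] = stale[499]

# For a clean sine, |diff| can never exceed amp * 2*pi*f_sig / fs.
max_step = amp * 2 * np.pi * f_sig / fs
threshold = 1.5 * max_step

d_clean = np.diff(clean)
d_stale = np.diff(stale)

# A repeated sample gives one zero step followed by one double-size step.
suspect = np.flatnonzero(np.abs(d_stale) > threshold)
```

A repeat near the peak of the sine produces a much smaller jump and would slip under this simple threshold, which is one reason to look at the whole diff() plot for a fuzzy outline rather than rely on a single cutoff.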
The above comment proves that the motherboard is sending the right data off of it! However, it does not prove the right data is appearing at the AMDC board.... Maybe, in hardware, the packets are getting corrupted somehow? The UART is at 25Mbaud! And it has to go through isolators and various ICs. I will check if we are getting any corrupt data packets in the UART RX on the AMDC FPGA.
The AMDC FPGA IP for the motherboard keeps track of every single corrupt byte or timeout byte from the UART RX module. It keeps a counter for each. After running the AMDC and SensorCard for the last ~20 mins, here are the counter values:
V means valid, and you can see it keeps changing => this is good! That means new valid data is constantly coming in! C = corrupt, T = timeout; these are both all 0s. This is good: it means all the data is good!
If I unplug the motherboard from the AMDC:
Timeout and corrupt counters change, meaning they are working!
Therefore, my conclusion is that the UART interface between the boards is working fine. All the valid packets that we saw on the motherboard hardware are appearing in the FPGA and there, the registers....
Hmmm.
Interesting. Solved it!
The issue was how I was requesting new data from the motherboard. I requested new data from the motherboard in the scheduler right before it started running all the new tasks for a given timeslice.
The problem was that, sometimes, the data had not yet arrived at the AMDC by the time the task that wanted it ran. The task therefore had no new data and reused the last valid sample, which shows up as double samples.
Requesting new data from the motherboard right AFTER all the tasks run for a timeslice gives the data much more time to arrive, thus solving the issue...
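The race can be illustrated with a toy timing model in plain Python. Everything numeric here is an assumption chosen for illustration (the 6us request-to-arrival latency, the task read offsets, and the 40us point at which tasks finish); only the 100us timeslice comes from the thread. The function name `seen_samples` is also made up.

```python
TS_US = 100.0  # control period from the thread


def seen_samples(request_after_tasks, read_offsets_us,
                 latency_us=6.0, tasks_end_us=40.0):
    """Return the sample id the task sees in each timeslice (None = no data).

    Sample id k is the data requested during timeslice k. latency_us and
    tasks_end_us are ASSUMED values, not measurements.
    """
    seen = []
    arrivals = []  # (arrival_time_us, sample_id), in arrival order
    for k, offset in enumerate(read_offsets_us):
        slice_start = k * TS_US
        if not request_after_tasks:
            # Old behavior: request right before the tasks start running.
            arrivals.append((slice_start + latency_us, k))
        read_time = slice_start + offset
        ready = [sid for (t_arr, sid) in arrivals if t_arr <= read_time]
        seen.append(ready[-1] if ready else None)
        if request_after_tasks:
            # Fixed behavior: request once the tasks have finished.
            arrivals.append((slice_start + tasks_end_us + latency_us, k))
    return seen


# The task sometimes reads early (2us) and sometimes late (9us) in its slice.
offsets = [9.0, 2.0, 9.0, 9.0, 2.0, 9.0]
before = seen_samples(False, offsets)  # [0, 0, 2, 3, 3, 5]
after = seen_samples(True, offsets)    # [None, 0, 1, 2, 3, 4]
```

In this model, requesting before the tasks yields repeated ids whenever the read beats the arrival, while requesting after the tasks yields a clean sequence that is always exactly one timeslice old.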
This is a plot of 1 second of data (with one .diff()) from the motherboard (previously, this would have shown the issue):
Now, the data timing is for sure one Ts late.... We can maybe fix this by requesting data at the beginning of the controller, and then waiting until it comes in... But, this approach of being 1 Ts late is okay for now.
Just to emphasize, I recaptured that same plot above using the WRONG auto request data scheduler code (i.e., it requests right before it runs the tasks).
Yucky.
And lastly, I restored the code to the correct implementation and sampled 10 seconds of data from all 8 motherboard channels at 10kHz, resulting in 100k samples.
Raw samples:
Raw samples + .diff() to show any errors:
FFT of raw data with 3 .diff() for all channels to look at noise floor:
The noise floor with lines at sine wave harmonics (data with 3 .diff()):
Lastly, when I turn off the function gen sine wave, here is the noise floor of 0V input (no .diff() applied, straight raw data):
This is an FFT of voltages, so I believe the magnitude of ~10e-4 means much less than 1mV of noise ripple (?)
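Whether an FFT magnitude reads directly in volts depends on how the FFT was normalized, which is not stated here. A minimal sketch, assuming NumPy and a single-sided amplitude scaling, in which a sine of amplitude A shows up as a peak of height A; the function name `amplitude_spectrum` and the 1 kHz, 1 mV test tone are illustrative only.

```python
import numpy as np


def amplitude_spectrum(x, fs):
    """Single-sided amplitude spectrum: a sine of amplitude A peaks at A."""
    n = len(x)
    amp = np.abs(np.fft.rfft(x)) * 2.0 / n
    amp[0] /= 2.0            # DC bin has no mirrored counterpart
    if n % 2 == 0:
        amp[-1] /= 2.0       # neither does the Nyquist bin
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs, amp


fs = 10_000.0
t = np.arange(10_000) / fs                      # 1 second at 10kHz
x = 1e-3 * np.sin(2 * np.pi * 1000.0 * t)       # 1 mV test tone
freqs, amp = amplitude_spectrum(x, fs)
peak = int(np.argmax(amp))
```

With this scaling, a bin at the 1e-3 level corresponds to a 1 mV sinusoidal component at that frequency. With an unnormalized FFT the same tone would read N/2 times larger, so the scaling has to be known before reading millivolts off the plot.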
Closing this issue. Resolved.
@npetersen2 Glad to hear you resolved this!
[If I understand this correctly]: should we have some kind of status register in the FPGA fabric to help handle this? I am thinking of a model I have seen with UARTs, where a status flag is asserted when new data is available and is cleared every time the data is read. This could be implemented alongside a FIFO. I realize that you would still have to do your sampling at the end of your task (because just knowing that you don't have new data isn't enough to make new data arrive), but from an error detection perspective, it may be nice to know if you are working with fresh data or not.
@elsevers Yes, I agree. I actually already have a register bit in the firmware which indicates if the data is valid or not. Adding another bit like you describe above would be easy.
For high performance firmware, the idea is to use the forthcoming sys/controller.c type which runs the user controller code in a PWM-based ISR context. At the beginning of the controller code, the user should request new data from the motherboard. Then, they should busy poll the status register to wait until the new data arrives. Only then should they proceed and run their controller. This will ensure they have the latest data and that there is no 1 Ts delay in the sampling.
For low performance firmware, the user can just turn on the "auto request data" flag and the scheduler will request data after it runs all tasks. This ensures the data is always "fresh," but will be 1 Ts late. However, the user code does not have to worry about asking for new data!
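The "new data" flag and the busy-poll pattern discussed above can be modeled in a few lines of Python. This is a toy model for illustration, not the firmware API: the class `SampleRegister`, the helper `wait_for_fresh`, and the poll limit are all hypothetical, and real firmware would do this in C against the FPGA registers.

```python
class SampleRegister:
    """Toy model of a data register with a 'new data' status flag."""

    def __init__(self):
        self._value = None
        self._fresh = False

    def hardware_write(self, value):
        """Called when a new sample arrives: store it and set the flag."""
        self._value = value
        self._fresh = True

    def is_fresh(self):
        return self._fresh

    def read(self):
        """Return (value, was_fresh) and clear the flag, like a UART status bit."""
        fresh = self._fresh
        self._fresh = False
        return self._value, fresh


def wait_for_fresh(reg, poll, max_polls=1000):
    """Busy poll until new data arrives; give up after max_polls iterations."""
    for _ in range(max_polls):
        if reg.is_fresh():
            return reg.read()[0]
        poll()  # stand-in for time passing while the hardware works
    raise TimeoutError("no new sample arrived")


# Simulated hardware: the sample lands after three polls.
reg = SampleRegister()
countdown = [3]


def tick():
    countdown[0] -= 1
    if countdown[0] == 0:
        reg.hardware_write(1234)


value = wait_for_fresh(reg, tick)
second_read = reg.read()  # same value, but flagged as not fresh
```

The bounded poll count is a design choice for the sketch: it keeps a disconnected motherboard from hanging the controller forever.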
For future reference, this Jupyter notebook can be used to parse data from the AMDS that is sampled via a logic analyzer looking at the UART signal lines. This is to validate the samples being sent are valid.
This is a valid Jupyter notebook! Rename it with the right extension to look at it. It is .txt so GitHub would let me upload it...
AMDS_SampleParsing.txt
When testing the new SensorCard data interface, the interface mostly works great. However, every once in a while, it seems that samples are dropped somewhere.
This issue serves as a log of me trying to figure out where.
This is not the same issue as #163! However, in #163, you can see this problem clearly, as seen below (image copied from other issue) in the 1st difference of the sampled data from a function generator: