Severson-Group / AMDC-Firmware

Embedded system code (C and Verilog) which runs the AMDC Hardware
http://docs.amdc.dev/firmware
BSD 3-Clause "New" or "Revised" License
30 stars 5 forks source link

New AMDS Interface not timing out correctly #393

Open codecubepi opened 1 month ago

codecubepi commented 1 month ago

When the AMDS comes unplugged, the counters all freeze (namely the timeout counter). I believe this is because the fall on the AMDS data line is what starts the state machines and timeout counter counting. But if there is no data, the line will never fall and the SMs never start, so there is no timeout. This will also probably cause the timing manager to freeze, since the done signal for the AMDS will never be asserted.

Comment from Nathan: We need to make sure the timing manager never freezes, so probably have a user configurable time-out register. If sensors are enabled and every sensor isn't done by X time after trigger, still call the ISR, but indicate this in some error/status register

codecubepi commented 1 month ago

Update:

I made an attempt to fix this issue in commit 0d27cc0.

When unplugging the AMDS from the AMDC during operation, I could not replicate the freezing issue I initially encountered. The Timing Manager still sends out triggers, which means the AMDS driver must be timing out and asserting that it is done. However, there is still some glitchiness in the driver's timeout behaviour.

image

As seen in the image above, the AMDS will return valid data and not timeout, until it is unplugged. After unplugging, reading the debug counters gives some interesting results.

The valid counter freezes, which is expected behaviour. Both timeout counters are going, which is expected behaviour as well. However, the image shows that the data line 0 corrupt counter does not stop counting, while the data line 1 corrupt counter is correctly frozen. (in some tests, it was the data line 1 corrupt counter still running, and the data line 0 corrupt counter correctly frozen).

If a data line's corrupt counter is not frozen, that means that the state machines in adc_uart_rx.v and uart_rx.v are not remaining in the IDLE state and are instead still trying to read bytes, which further implies that somehow there is a falling edge on the data line which is starting the state machines... but the data line is unplugged and no falling edges can be seen on the scope... weird.

For now, since the "good operating condition" is working as expected, I will proceed with merging together my work with Annika's so that we can get a mosty-working pre-release of v1.3 to Anirudh. This issue can be further debugged once Anirudh is up and running. Just don't unplug the AMDS you are trying to use 🙃.

codecubepi commented 1 month ago

Also see:

codecubepi commented 2 weeks ago

A note on this issue after debugging BP4 with Anirudh... I think two additional things would be good to add:

  1. Only enabled AMDS sensors increase their timeout counters. Currently, all AMDS sensors receive the trigger signal from the timing manager in the FPGA, including those which are not enabled. Because they are not enabled (and obviously not plugged into an AMDS) they will of course timeout, and increment their counters. This is a little silly, if a given AMDS driver is not even enabled, there's not really a reason for it to increase its timeout counter.
  2. I think it would be nice to separate the timeout counter into two separate counters: one for a trigger timeout (if a trigger is received but the AMDS never responds at all, determined in adc_uart_rx.v) and a packet/byte timeout (a timeout midway through receiving data packets, determined in uart_rx.v).