[spi] fix preload issue

This is ~partial~ an RCA on SPI issues:

Screenshot from 2021-04-07 23-11-41

Note we are sending the following sequence: {0x02, 0x00, 0x01}, but three CIPO pulses are present. The erroneous pulse occurs due to signalling complete transfer at the rising edge (when we read). Since the FSM is only a cycle or two, we end up loading the next word half way through sending the last bit (generally in Mode 0 Transfer CIPO state exist for a whole period, whereas receive is during the high half of a clock period). By waiting until the falling edge to signal a complete transfer we correct our mode 0 behavior. This does not completely address SPI issues, but it has put the errors into two main categories: True and NACK. Before we would have randomly wrong reads in the last bit of a byte due to the wrong behavior.

Update:

The first fix revealed that we have metastability somewhere, since we were reading + writing correctly. This was validated by readback of the received word. This showed the system was reading the sent word + header correctly, was was not entering the FSM correctly. Initial throught was that the FSM must be metastable. After a refactor to make the machine "one hot" this did not fix. After closer investigation of the signals the only potentially metastable signal was the chip select - CS. Sure enough, by clocking this signal in error rates went to zero.

To be completely frank I do not understand yet how this would propagate though the logic, because as I see it now, this should not be the case. The most likely effect would be rx_byte_ready_r inside spi violating setup times. Regardless, the lesson learned in every input signal should be clocked in + buffered.

I am running extended tests with these patches. A 30 minute tests has shown no errors at 1Mhz. I will run a multi-hour test over night for additional validation.

RAPcores / rapcores

[spi] fix preload issue #193