Closed JAndrassy closed 10 months ago
Hello @JAndrassy ,
Explanation of pins: I will attach a doc which explains further these pins in short while. Basically, Handshake denotes the readiness of slave for next transaction. If handshake is high, then only the Linux (being master of spi bus), will start spi transaction (if either party has buffer to transfer). Idle state for handshake is high. Will be hold low, when slave is still populating buffer. Linux is programmed to identify this level change by positive edge trigger interrupt. Also additionally positive edge trigger for data ready.
As you had pointed: Handshake is additionally changed from high to low when CS is asserted. This change is a fix to rare race condition on bus.
Race condition: slave is controls the Handshake pin. But as Linux is so fast, by the time complete high to low Transition happens, Linux identify the state transition and immediately triggers next transaction. Due to this rare unexpected transaction, spi transaction starts at host, but slave starts in middle of the transaction, making some garbage buffer sent to host and garbage packet received from host.
To avoid this race condition , we get handshake low as soon as slave identify the CS is tradition from high to low (for active low CS). Basically motive to start the transaction when handshake is high(slave activeness for next transaction) is served. So no issues to get handshake down till transaction is complete and slave buffer is made ready for next transaction.
Low level for handshake is transient state , which is small time when slave is yet getting buffer from queue and pushing to spi device driver.
Similarly low is idle state for data ready. Is only high when slave has any new buffer to be sent to host.
Handshake is for synchronisation. Data ready is async, as host will also start spi transaction on data ready posedge trigger. The reason for data ready is indirect way to trigger the spi transaction (through host)
Nevertheless, you stand right, we should update the documentation for this new understanding.
I hope, it is clear now. Do let us know should you have any doubts.
I use with an MCU. On logic analyzer I only see the handshake pin exactly copy the state of the CS pin. doesn't the binding to CS pin suppress any other change state of the pin?
ok. I commented out the register_hs_disable_pin
in spi_slave_api.c and now I see short LOW pulses on handshake, but I see that they are on a position where they would merge with the CS driven LOW
the driver I use was written by Arduino. it evaluates only the ready pin. and the esp-hosted fw always replays with a dummy message (FF and zeros) first. then the ready pin is still HIGH and the next transaction returns the message. but it doesn't look like there is a better option to do it when the HS pin goes low with the CS pin.
on the picture the short pulse in the 4th row is the FF. and it is because HS was not yet LOW. the picture is from the version without register_hs_disable_pin
and the driver doesn't use HS.
the driver I use was written by Arduino. it evaluates only the ready pin
I'm not aware of any such implementation. The first slave to host transaction may be dummy data.
Why? Slave always have to make sure that it is ready for any upcoming transaction. So when there is no tx data, slave has to populate at least dummy data. If the frequency of data is less, you will always see the first transaction is dummy and next immediately next transaction carries data and then data ready becomes low ( if slave still doesn't have any pending data).
But, if you run throughout etc, you will see the slave always have populated buffer and no dummy buffer. So for short while it is not problem, considering the positive side that bus becomes idle when there is no data either side.
sorry, I don't complain about the Arduino driver here. I just explain what I have.
it is working, but I wasn't sure if it is as intended. thank you for the explanation.
If you can point me the implementation, I can take a look and compare.
Actually, while evolving we had similar very old design what you described . Then at the same time you had mentioned the hs disable pin, which is latest optimisation. But that makes me confused and curious at the same time, if anything better, we can definitely have a look
But anyway, the current master design is extremely stable and evolved from many issues. Our design considerations include lower power use by idling spi, whenever not required. And fastest possible full duplex transfers with stable network performance.
it has a form of Arduino library, but it uses Renesas SDK so it runs only on specific Arduinos: https://github.com/arduino/ArduinoCore-renesas/tree/main/libraries/ESPhost/src
my version which is a clean Arduino version should run on any ARM Arduino. I tested with PI Pico. https://github.com/JAndrassy/ESPHost
there is a test sketch in https://github.com/JAndrassy/ESPHost/blob/main/examples/ESP32_TEST/ESP32_TEST.ino
Sorry forgot to mention, you have these waveforms captured. Can you send me raw format? And possibly software to open, it will be easier to look and understand what you are saying.
the software is Saleae Logic https://www.saleae.com/downloads/ I use it with a cheap no-name analyzer hardware.
The captured signals are attached in zip hosted-fg-hs-cs.sal is for unmodified fw hosted-fg-hs-ignored.sal is for fw with register_hs_disable_pin commented out in esp_spi_init in spi_slave_api.c
https://github.com/espressif/esp-hosted/issues/308#issuecomment-1874059336 I didn't go through the full code yet, may take some time to understand fully. Nevertheless, I had checked your waveforms.
The SPI Clock is not stable from the waveforms you had shared.
Possibility 1) I am not sure if you had used sufficiently high sampling frequency or not? Possibility 2) If the sampling is correct, You might need to work on getting the timings correct.
Are the SPI modes used at both place, ESP and Host the same? Need to check if MOSI and MISO transitions are abiding the clock boundaries, if not, which will end up in incorrect parsing of messages. Also, incorrect message type and/or len understanding may lead to corruption or assert due to low memory. In ESP-Hosted, we do filter such rubbish/garbage packets and print them, to let user know something is not right at either side. I did not come across in repo you pointed, if you wish you can add additionally for safeguard.
Very similar comments. In general your lags are high in handshake high and clock low, I did not see the race condition issue in your both waveforms. But to repeat myself, this is rare scenario and not generally occurring. because of race condition, packets get occasionally drop/ignored in background leading to unstable system or occasional crashes.
Some old waveform I had collected that time, where we introduced extra GPIO 'DISABLE_HS' to handle the this race. disable_hs_ext_gpio.zip Later we optimised further to re-use CS as interrupt in slave, instead of additional 'DISABLE_HS' GPIO.
Some reference waveform where this isse was captured: [Uploading race_condition_pkt_capture_pkt_739e_94s83ms900us9us.sal.zip…]()
In my opinion, you have yet not reached yet to the point, to take decision of (HS disable on CS assert) or (not). First you have to check why your host clock is not stable (is it capturing settings issue or real clock issue)
irrespective of ESP-Hosted solution you use.
the SPI frequency is set to 10 MHz for the classic esp32. the logic analyzer sampling frequency was set to 8 MHz so it can't sample the SCK properly.
yes. Ideally, you should have sampling freq >= actual clock * 4
If your analyser doesn't support, you can also lower the freq to say 2MHz so that 8MHz sampling could be fine
any updates?
sorry no. It works good enough for me right now. I used the login analyzer only to see the big picture. I have the content of the SPI messages in debug print output so I don't need the analyzer to decode SPI. And in the debug print of the messages only the dummy message (FF 00 ...) was something I didn't understand.
Great... It was indeed good interaction. Thanks.
the fg spi protocol doc has
and
and
but in the firmware
register_hs_disable_pin
binds the HS pin to slave select pin and nothing else happens with the HS pin. it always copies the slave select pin so it is always HIGH when checked.