mbed-ce / mbed-os

Arm Mbed OS is a platform operating system designed for the internet of things
https://mbed.com
Other
56 stars 12 forks source link

Clock line of SPIF is dipping when you use manual chip select and you mounted an SD #226

Open lefebvresam opened 5 months ago

lefebvresam commented 5 months ago

When you mount/unmount an SD with following config:

"sd.SPI_CS": "SD_CS",
"sd.SPI_MOSI": "SD_MOSI",
"sd.SPI_MISO": "SD_MISO",
"sd.SPI_CLK": "SD_CLK",
"sd.TRX_FREQUENCY": 40000000,
/**** SD card pins ****/ //SPI2
SD_MOSI = PB_15,
SD_MISO = PB_14,
SD_CLK = PB_13,
SD_CS = PB_12,
SDBlockDevice sd;
FATFileSystem fs("fs");
fs.mount(&sd);
fs.unmount();

And you use SPIF with this config:

"spif-driver.SPI_CS": "SPIF_CS",
"spif-driver.SPI_MOSI": "SPIF_MOSI",
"spif-driver.SPI_MISO": "SPIF_MISO",
"spif-driver.SPI_CLK": "SPIF_CLK",            
"spif-driver.SPI_FREQ": 40000000   
/**** SPIF pins ****/ //SPI3
SPIF_MOSI = PC_12,
SPIF_MISO = PC_11,
SPIF_CLK = PC_10,
SPIF_CS = PA_15_ALT0,

And you use legacy code which is not using HW chipselect:

_select(true); // set CS 0
_spiwrite(0x80);            // RS:1 (Cmd/Status), RW:0 (Write)
_spiwrite(command);
if (data <= 0xFF) {   // only if in the valid range
    _spiwrite(0x00);
    _spiwrite(data);
}
_select(false); // set CS 1

During the first SPI cycle there is a unwanted dip in the clock which misconfigures the slave device. This has as result that all the following data that you want to read will be read as zero.

New15

If you remove the lines:

//  fs.mount(&sd);
//  fs.unmount();

Then you have a correct start cycle and the SPI slave is readout correctly:

New16

multiplemonomials commented 5 months ago

Hmm, this could be the classic issue: when you do the first transfer to an SPI device, if the SPI bus had been in a different mode, it will change the mode (e.g. mode 3 to mode 0), which can cause the clock line to change state (e.g. high to low). This is why we have that warning in the SPI docs about not using a GPIO to control the chip select -- it's an easy way to get yourself in trouble because the chip might get selected before the clock line is ready.

Does the issue still happen if you change the code to pass the chip select pin to SPI as a gpio ssel pin? The SD card driver built in to Mbed does support and use that mode.

lefebvresam commented 5 months ago

To clarify this issue is on the SPIF and not on the SD. But let me do a test to use the HW chipselect. I'm convinced that the best practice is to transfer the cs control to the SPI module and use 16 bit transfers instead of two times 8 bit transfers. But the point is that a lot of legacy libs on the internet still use SW chipselect and are not optimal. What worked before is for that reason no longer working and It could give a lot of poeple unwanted stress. So in the first place a dip on the clock should not happen when you switch mode. But indead if you switch from active high clock to active low clock this is almost insurmountable. The switch should be done before the SW cs and you have to init again before using SPI SW line. The strange thing is that the SD SPI and the SPIF are on different outputs, so how they can influence each other?

multiplemonomials commented 5 months ago

Hmm, yeah, if there is only one device on the SPI peripheral, one wouldn't expect a glitch like this to happen each time. Are you sure that the SD and SPIF are on different SPI pins and different SPI peripherals (e.g. SPI2 vs SPI3)?

lefebvresam commented 5 months ago

I suppose SPI selection is automatically done based on pin assignment. I use these pins in the instatiations:

    /**** SD card pins ****/ //SPI2
    SD_SEL = PC_6,
    SD_MOSI = PB_15,
    SD_MISO = PB_14,
    SD_CLK = PB_13,
    SD_CS = PB_12,

    /**** SPIF pins ****/ //SPI3
    SPIF_SEL = PD_2,
    SPIF_MOSI = PC_12,
    SPIF_MISO = PC_11,
    SPIF_CLK = PC_10,
    SPIF_CS = PA_15_ALT0,
lefebvresam commented 4 months ago

Because I had this issue another time I did some further investigation of it to find the root cause of the dips in the clock line. Having these dips invokes writing errors to the graphical chip and sometimes widgets not rendering at the right location.

I have the following:

As soon as the construction of an SPI object is used or derefered, there is an 'aquiring' mechanism to share the same SPI resource between multiple instances.

I commented out a lot of other calls to isolate the issue:

In _acquire SPI.cpp I put std::printf("o: %u t:%u", (int)_peripheral->owner, (int)this); The result is that the same entry in the list SPI::spi_peripheral_s is reused with _peripheral_name = GlobalSPI;

construct ra8875
o:0 t:536889908 // (call _acquire)

construct dmaflash

construct sdblock
lcd init
o:536889908 t:536890980 // (call _acquire)
sd init
o:536890980 t:536889776 // (call _acquire)
lcd data
o:536889776 t:536890980 // (call _acquire) -> here I see the dip in the clockline on the scope
o:536890980 t:536890980 // (call _acquire)

In _acuire() there is a context switch where the init is called which fills in the current pins etc and a reconfigure is happening with spi_format() and spi_frequency(). Some registers are written there which invokes transision effects on the clock line.

Next to the solution that using the ssel line as a parameter input during construction of SPI, where the ccsel is pulled low after those context switches, there is another bug I encountered.

The list is filled up with SPI devices selected from the pinmap (spi_get_peripheral_name) but only if DEVICE_SPI_COUNT is defined.

    // Need backwards compatibility with HALs not providing API
#ifdef DEVICE_SPI_COUNT
    _peripheral_name = spi_get_peripheral_name(_mosi, _miso, _sclk);
#else
    _peripheral_name = GlobalSPI;
#endif

In PeripheralNames.h of TARGET_STM32U5 this is not defined.

If I put this:

#define DEVICE_SPI_COUNT 3 // -> this is inserted
typedef enum {
    SPI_1 = (int)SPI1_BASE,
    SPI_2 = (int)SPI2_BASE,
    SPI_3 = (int)SPI3_BASE
} SPIName;

The dip is also gone because this context switch is not there anymore because different HW SPI blocks are now allocated for both interfaces and context switch chances between sd SPI and own SPI are reduced.

So I propose to create a PR for adding this line.

multiplemonomials commented 4 months ago

Nice debugging! Thank you for tracking this down! Hmm, I do have #179 open for another piece of this, which is that it's kind of dumb for SPI::_acquire() to call init() in the first place -- we should make a separate function than init which does pin muxing only. But this is definitely its own issue. I wonder how many devices have a similar problem.

multiplemonomials commented 4 months ago

OK I did a random sample of some targets, and seems like at least half the current set of target devices don't implement spi_get_peripheral_name() and don't define DEVICE_SPI_COUNT. So it seems like this is a separate issue, that a lot of the MCU vendors didn't implement this function (likely because the test suite didn't generate any errors if they didn't!),

However, this situation is a bit different, because STM does implement spi_get_peripheral_name() (original PR), but someone simply forgot to define it for STM32U5 and STM32L5. @lefebvresam Please do create a PR that adds this define for U5 and L5!

I have also opened #255 to look into this later.