drivers: spi: spi_mcux_lpspi: inconsistent chip select behaviour

henrikbrixandersen commented 5 years ago

Describe the bug

The MCUX LPSPI SPI driver handles SPI chip selects inconsistently when comparing GPIO CS and "native" controller CS handling over multipart transfers.

With GPIO CS enabled (where the LPSPI controller does not control the CS line), the CS line is kept asserted through the entire transfer. By contrast, when the CS line is controlled by the LPSPI controller itself, it is deasserted between the different parts of the transfer.

To Reproduce

#include <zephyr.h>
#include <misc/printk.h>
#include <spi.h>

#define SPI_LABEL     "SPI_1"
#define GPIO_CS_LABEL "GPIO_0"
#define GPIO_CS_PIN   16

void main(void)
{
    struct spi_config spi_cfg;
    struct device *spi_dev;
    struct spi_cs_control spi_cs;
    u8_t cmd[2] = { 0xaa, 0x55 }; /* 2 bytes command */
    u8_t response[2];             /* 2 bytes response */

    const struct spi_buf tx_bufs[] = {
        {
            .buf = cmd,
            .len = sizeof(cmd),
        },
    };
    const struct spi_buf_set tx = {
        .buffers = tx_bufs,
        .count = ARRAY_SIZE(tx_bufs),
    };
    const struct spi_buf rx_bufs[] = {
        {
            .buf = NULL,
            .len = sizeof(cmd), /* 2 dummy bytes */
        },
        {
            .buf = response,
            .len = sizeof(response),
        },
    };
    const struct spi_buf_set rx = {
        .buffers = rx_bufs,
        .count = ARRAY_SIZE(rx_bufs),
    };

    spi_dev = device_get_binding(SPI_LABEL);
    if (!spi_dev) {
        printk("SPI device not found\n");
        return;
    }

    spi_cfg.operation = SPI_OP_MODE_MASTER | SPI_TRANSFER_MSB |
        SPI_WORD_SET(8);
    spi_cfg.frequency = 1000000;
    spi_cfg.slave = 2;

#if 0
    spi_cs.gpio_dev = device_get_binding(GPIO_CS_LABEL);
    if (!spi_cs.gpio_dev) {
        printk("SPI GPIO CS device not found\n");
        return;
    }
    spi_cs.gpio_pin = GPIO_CS_PIN;
    spi_cs.delay = 1; /* us */
    spi_cfg.cs = &spi_cs;
#else
    spi_cfg.cs = NULL;
#endif

    spi_transceive(spi_dev, &spi_cfg, &tx, &rx);
}

Expected behavior

The CS line remains asserted through all the parts of the multipart transfer.

Impact

Deasserting the CS line in the middle of a transfer causes problems, e.g. when communicating with SPI EEPROMs.

Screenshots or console output

When using GPIO CS: (screenshot not preserved)

When using LPSPI CS: (screenshot not preserved)

Environment (please complete the following information):

OS: GNU/Linux
Toolchain: Zephyr SDK
Commit SHA or Version used: c88c91942d984e4c2ad2b3484576d39c2538858d

Additional context

This was spotted when trying to use a SPI EEPROM with the TWR-KE18F board, but it is not limited to that board nor to the KE1xF SoC series, as far as I can tell.

ioannisg commented 5 years ago

@MaureenHelm is this issue being looked at?

MaureenHelm commented 5 years ago

Not yet, but going to try to look at it tomorrow.

MaureenHelm commented 5 years ago

This is caused by the underlying MCUX SDK LPSPI driver missing a feature to hold the chip select active after a transfer. The underlying MCUX SDK DSPI driver used on frdm_k64f has a kDSPI_MasterActiveAfterTransfer flag that we need in the LPSPI driver used on KE1xF and i.MX RT10xx. I will file an internal NXP ticket to add this feature.

Decreasing the priority of this bug from medium to low since we can work around it with GPIO CS.

henrikbrixandersen commented 4 years ago

This is still an issue with the most recent MCUX SDK (I was just bitten by it today while bringing up a newly developed board). Should we change all affected in-tree boards to use cs-gpios until this is resolved?

agansari commented 4 years ago

@henrikbrixandersen I've encountered similar issues with SPI on LPC55xxx, see my two pull requests:

#26213: using GPIO CS rather than the device's slave select
#26371: a fix for SS remaining low until the frame ends; undoes parts of the previous pull request

pdgendt commented 1 year ago

Re-opening after a Discord discussion, as this issue is still relevant.

aedancullen commented 1 year ago

If CONFIG_SPI_MCUX_FLEXCOMM_DMA=y, this issue also appears with the spi_mcux_flexcomm driver.

(If CONFIG_SPI_MCUX_FLEXCOMM_DMA is not enabled, then spi_mcux_flexcomm has correct CS behavior regardless of whether GPIO or hardware CS is used.)

github-actions[bot] commented 1 year ago

This issue has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this issue will automatically be closed in 14 days. Note, that you can always re-open a closed issue at any time.

decsny commented 7 months ago

I produced this code to try to reproduce the issue on mimxrt1024_evk on commit b573f447f04f042f427fafd23fb8cd4cacfe6e30 based on what was originally posted by @henrikbrixandersen :

#include <zephyr/kernel.h>
#include <stdio.h>
#include <zephyr/drivers/spi.h>
#include <zephyr/drivers/gpio.h>
#include <zephyr/sys/util.h>

struct spi_config spi_cfg = SPI_CONFIG_DT(DT_NODELABEL(dummy_spi_dev),
        (SPI_OP_MODE_MASTER | SPI_TRANSFER_MSB | SPI_WORD_SET(8)), 1);

const struct device *spi_ctlr = DEVICE_DT_GET(DT_NODELABEL(lpspi1));

#define XFER_SIZE 2

uint8_t cmd[XFER_SIZE] = {0xaa, 0x55};
uint8_t resp[XFER_SIZE];

const struct spi_buf tx_bufs[] = {
    {
        .buf = cmd,
        .len = sizeof(cmd),
    },
};

const struct spi_buf_set tx = {
    .buffers = tx_bufs,
    .count = ARRAY_SIZE(tx_bufs),
};

const struct spi_buf rx_bufs[] = {
    {
        .buf = NULL,
        .len = sizeof(cmd),
    },
    {
        .buf = resp,
        .len = sizeof(resp),
    },
};

const struct spi_buf_set rx = {
    .buffers = rx_bufs,
    .count = ARRAY_SIZE(rx_bufs),
};

int main(void)
{
    /* DEVICE_DT_GET() never yields NULL, so check readiness instead. */
    if (!device_is_ready(spi_ctlr)) {
        printf("SPI device not ready\n");
        return 1;
    }

    if (!spi_cfg.cs.gpio.port) {
        printf("Using native CS\n");
    } else {
        printf("Using GPIO CS\n");
    }

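    /* Report a failed transfer instead of silently ignoring it. */
    int ret = spi_transceive(spi_ctlr, &spi_cfg, &tx, &rx);

    if (ret < 0) {
        printf("spi_transceive failed: %d\n", ret);
        return 1;
    }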

    printf("Transmission finished.\n");

    return 0;
}

With DT overlay

#include <zephyr/dt-bindings/gpio/gpio.h>
#include <zephyr/dt-bindings/spi/spi.h>

&pinmux_lpspi1 {
        group0 {
            pinmux =
                <&iomuxc_gpio_ad_b0_11_gpio1_io11>,
                <&iomuxc_gpio_ad_b0_10_lpspi1_sck>,
                <&iomuxc_gpio_ad_b0_11_lpspi1_pcs0>,
                <&iomuxc_gpio_ad_b0_12_lpspi1_sdo>,
                <&iomuxc_gpio_ad_b0_13_lpspi1_sdi>;
            drive-strength = "r0-6";
            slew-rate = "slow";
            nxp,speed = "100-mhz";
        };
};

&lpspi1 {
    cs-gpios = <&gpio1 11 GPIO_ACTIVE_LOW>;
    dummy_spi_dev: dummy_spi_dev@0 {
        compatible = "spi-device";
        reg = <0>;
        duplex = <SPI_FULL_DUPLEX>;
        spi-max-frequency = <100000>;
    };
};

I observed the same behavior whether or not the cs-gpios property was commented out of the DT: in both cases, the CS is deasserted between the parts of the transfer.

Could someone who has had this issue please confirm whether I missed something in this code, whether the behavior did indeed change in the last 5 years, or whether the issue is specific to only some platforms?

henrikbrixandersen commented 7 months ago

I can try to reproduce this when I return from Seattle. My original finding was on the twr_ke18f board.

decsny commented 6 months ago

@henrikbrixandersen to be clear, I am finding the wrong behavior now with both gpios and native

decsny commented 5 months ago

Sorry about this. I just returned to this in the spirit of LTS bug fixing and realized that, when I tested GPIO CS before, I had accidentally left the pin muxed to the native CS (both mux options were in the overlay). Now I see the difference, whoops.

decsny commented 5 months ago

@henrikbrixandersen @pdgendt I found the cause of the problem, but fixing it requires relatively major changes. Can you describe why you want to use the native chip select instead of GPIO, so we can determine the priority of this?

henrikbrixandersen commented 5 months ago

We have been using GPIO CS until now because of this issue, but it results in increased latency compared to using the controller-provided CS lines.

decsny commented 5 months ago

I am assuming you are using the synchronous API, so by latency do you mean that the CPU time spent in the transceive function is significantly longer because of the need to control GPIO CS at the beginning and end of the transfer?

henrikbrixandersen commented 5 months ago

Yes. Controlling the GPIOs at each end of the transaction takes up a significant portion of the total transaction time.

We've learnt to work around it. It would be nice to see it fixed, but it no longer has high priority for us.

decsny commented 5 months ago

Okay, one more question for you, as I am trying to figure out what the behaviour is supposed to be according to the (relatively undocumented) Zephyr API: would you expect the chip select to deassert once all the buffers passed to spi_transceive have been clocked in/out, or after clocking out the number of bits specified by the data frame size in the operation field of struct spi_config?

henrikbrixandersen commented 5 months ago

I would expect the chip select to be deasserted only after all buffers in a transaction have been clocked in/out.

decsny commented 5 months ago

Okay, but if we take that to be the contract of the zephyr spi_transceive api, then I am wondering what is the purpose of the data frame size in the operation field of the spi_config.

To me it seems like the proper use of the API is to set the SPI_HOLD_ON_CS bit in the operation field of the spi_config struct, then release it with the spi_release API. Otherwise, it seems like the chip select delimits the number of bits specified in the frame size. With the hold_on_cs bit set, as far as I can see, the frame size is only used to determine the appropriate return value of spi_transceive? Maybe @tbursztyka can help explain the expected behavior here. Either way, the lpspi driver is currently implemented wrong.

henrikbrixandersen commented 5 months ago

Hold on CS is for keeping the CS line asserted after the transaction (or across multiple transactions).
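
A minimal host-side sketch of that distinction, not Zephyr code: `MODEL_HOLD_ON_CS`, `struct bus_model`, `model_transceive()` and `model_release()` are hypothetical stand-ins for SPI_HOLD_ON_CS, the bus state, spi_transceive() and spi_release(). It only models whether CS is still asserted once a transaction has finished.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag standing in for SPI_HOLD_ON_CS. */
#define MODEL_HOLD_ON_CS 0x1u

/* Hypothetical model of the bus: only the CS state is tracked. */
struct bus_model {
    bool cs_asserted;
};

/* Models one complete transaction (all buffers clocked in/out). */
void model_transceive(struct bus_model *bus, unsigned int flags)
{
    assert(bus != NULL);

    bus->cs_asserted = true; /* CS asserted for the whole transaction */
    /* ... all buffers would be clocked here ... */
    if ((flags & MODEL_HOLD_ON_CS) == 0u) {
        bus->cs_asserted = false; /* default: deassert when done */
    }
}

/* Models the explicit release that ends a held CS. */
void model_release(struct bus_model *bus)
{
    assert(bus != NULL);
    bus->cs_asserted = false;
}
```

In this model the hold flag changes nothing during a transaction; it only decides whether CS is left asserted afterwards, until an explicit release.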

tbursztyka commented 5 months ago

As @henrikbrixandersen mentioned, the CS line has to be asserted for the whole transaction. The buffers given to spi_transceive() are scatter-gather type, so together they represent one and only one transaction.

(The data frame size name in the documentation relates to the word size. This is mandatory to know how to interpret the buffers and how to r/w to/from the controller. SPI_HOLD_ON_CS is a specific configuration bit with which you can request the CS line to stay asserted after the spi_transceive() call. There were a few use cases supporting that feature back then.)
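
To make the difference concrete, here is a small host-side model, not driver code, that counts how often CS is asserted for one buffer set. All names (`struct cs_model`, `transfer_cs_per_transaction()`, `transfer_cs_per_chunk()`) are hypothetical. The second function mimics the behaviour reported in this issue, under the assumption that the driver handles the buffer set as several chunks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a CS line; not a Zephyr or MCUX type. */
struct cs_model {
    bool asserted;       /* current (simulated) CS state */
    unsigned int pulses; /* count of deasserted -> asserted edges */
};

static void cs_set(struct cs_model *cs, bool on)
{
    if (on && !cs->asserted) {
        cs->pulses++;
    }
    cs->asserted = on;
}

/* Expected: CS frames the whole scatter-gather buffer set. */
unsigned int transfer_cs_per_transaction(const size_t *chunk_lens, size_t count)
{
    struct cs_model cs = { false, 0u };

    assert(chunk_lens != NULL || count == 0u);
    cs_set(&cs, true);
    for (size_t i = 0; i < count; i++) {
        (void)chunk_lens[i]; /* chunk_lens[i] bytes would be clocked here */
    }
    cs_set(&cs, false);
    return cs.pulses;
}

/* Reported: CS is toggled around every chunk of the transfer. */
unsigned int transfer_cs_per_chunk(const size_t *chunk_lens, size_t count)
{
    struct cs_model cs = { false, 0u };

    assert(chunk_lens != NULL || count == 0u);
    for (size_t i = 0; i < count; i++) {
        cs_set(&cs, true);
        (void)chunk_lens[i]; /* chunk_lens[i] bytes would be clocked here */
        cs_set(&cs, false);
    }
    return cs.pulses;
}
```

For the reproducer in this issue (2 command bytes followed by 2 response bytes), the first function yields a single CS pulse and the second yields two, which is what breaks devices such as SPI EEPROMs.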

decsny commented 5 months ago

Okay, thanks. I think I was confused because in the LPSPI hardware a word, a frame, and a transfer are 3 different things, the NXP SDK has a different meaning of what a "transfer" is, and the Zephyr API has yet another definition of these things. In the case of the LPSPI hardware, a "word" is how wide the writes to the transmit register are, whereas a "frame" consists of anywhere between 8 and 4K bits and therefore potentially multiple writes.

I see the maximum data frame size in Zephyr is 64 bits. Is this because it's expected the register width will be up to 64 bits on some platforms? My question basically is: when I implement this driver, based on what I described, would it make sense to correlate the Zephyr frame size with the LPSPI word size, and choose the LPSPI "frame" size to be whatever is most convenient? Does the data frame size in the Zephyr API impose/imply any structure on the contents of the SPI buffers?

BTW I reread some of the comments in the header around the HOLD_ON_CS bit and what you are saying about it does make sense now when I read it like that. Again, I have been swimming in competing definitions of the same terms, so I was slightly confused at first.
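
As a worked example of the word/frame relationship described above, the number of transmit-register writes needed for one LPSPI "frame" is the frame size divided by the word size, rounded up. The helper below is a hypothetical sketch; the 32-bit word width used in the usage note is an assumption, not something stated in this thread.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of register writes ("words") needed to
 * send one frame of frame_bits bits when each write carries up to
 * word_bits bits. Rounds up so a partial last word still counts.
 */
uint32_t words_per_frame(uint32_t frame_bits, uint32_t word_bits)
{
    assert(word_bits > 0u);
    return (frame_bits + word_bits - 1u) / word_bits;
}
```

Assuming 32-bit writes, an 8-bit frame needs one write, a 40-bit frame needs two, and a 4096-bit frame needs 128.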

tbursztyka commented 5 months ago

The dfs is all about the SPI device you are dealing with and what the controller can support. When this API was designed, there were no identified devices or controllers able to deal with more than a 32-bit dfs, and I think on the controller side I haven't seen any supporting 64 bits since. Many are even stuck at 8 bits only. It does affect how you structure the buffers, yes. A hypothetical 4-bit dfs protocol on a SPI device would require using only 4 bits per buffer byte. You cannot cram 2 frames into one byte; the API does not support such an optimization (it would most likely require shifting the byte in the driver before giving it to the controller, which is a waste of time). And anyway, since we make SPI device drivers portable, they use the most commonly supported controller dfs, and buffers are meant to follow this. Actually, I quickly looked: there is only one SPI device driver that uses 16 bits as word size.
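
The hypothetical 4-bit case can be sketched as follows: data that an application holds packed two frames per byte has to be expanded to one frame per buffer byte before it is handed to the SPI API. `unpack_nibbles()` is an illustrative helper, not a Zephyr function, and placing each frame in the low nibble with the high nibble sent first is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: expand packed 4-bit frames (two per byte, high
 * nibble first) into one frame per byte, stored in the low 4 bits.
 * Returns the number of frames written to frames[].
 */
size_t unpack_nibbles(const uint8_t *packed, size_t packed_len,
                      uint8_t *frames, size_t frames_cap)
{
    assert(packed != NULL && frames != NULL);
    assert(frames_cap >= 2u * packed_len);

    for (size_t i = 0; i < packed_len; i++) {
        frames[2u * i] = (uint8_t)(packed[i] >> 4);
        frames[2u * i + 1u] = (uint8_t)(packed[i] & 0x0Fu);
    }
    return 2u * packed_len;
}
```

The SPI buffer for N 4-bit frames is therefore N bytes long, twice the size of the packed representation.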

decsny commented 5 months ago

Given this, can I make the assumption that the data frames will be multiples of 8 bits? That would greatly simplify the implementation. LPSPI supports odd numbers of bits in the frame all the way up to 4K, as I mentioned, and our HAL driver has a lot of control logic to account for this. If I can just assume that the data frame / word size in the buffers from the Zephyr API will be 8, 16, or 32 bits, that would greatly simplify things. Is the dfs in the operation field ever expected to be something like 9, 15, 27, or anything weird like this in Zephyr?

BTW, the minimum word size for the LPSPI is 2 bits :o

tbursztyka commented 5 months ago

In practice, expect multiples of 8 bits, yes. Many controllers or HALs, afaik, do not offer a finer-grained configuration of the word size. It was meant to be flexible, but the reality is that if you want to write portable code, you need to go for the most commonly understood config.
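
A driver relying on this could reject anything else up front. The sketch below is hypothetical: `word_size_supported()` and `bytes_per_word()` are illustrative names, and limiting the accepted set to 8, 16 and 32 bits follows the list proposed earlier in this thread rather than any documented requirement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check: accept only the word sizes discussed above. */
bool word_size_supported(uint32_t word_bits)
{
    return word_bits == 8u || word_bits == 16u || word_bits == 32u;
}

/* Bytes each word occupies in the buffers, for supported sizes only. */
uint32_t bytes_per_word(uint32_t word_bits)
{
    assert(word_size_supported(word_bits));
    return word_bits / 8u;
}
```

A real driver would presumably return an error code for unsupported sizes instead of asserting; the assert only keeps the sketch short.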

ofirshe commented 2 months ago

I'm encountering the same issue, as I described here: https://github.com/zephyrproject-rtos/zephyr/discussions/77999. Any suggestions on how to resolve it?

decsny commented 2 months ago

NXP is planning to contribute some rework of the lpspi driver soon. There is no quick fix; the existing driver just doesn't meet the Zephyr API and is broken in a lot of ways. This issue has been boiling over lately on a lot of fronts, so it will be addressed soon.

ofirshe commented 2 months ago

Thanks for your response, @decsny. I hope this issue gets resolved soon.

ofirshe commented 3 days ago

Hey,

Could you please let us know when NXP plans to address this issue? Currently, the driver defaults to relying on the hardware CS mode, which may not function as expected for new users attempting to utilize it.

Thanks.

zephyrproject-rtos / zephyr

drivers: spi: spi_mcux_lpspi: inconsistent chip select behaviour #16544