Closed swift-tk closed 3 months ago
Arch WG:
Use the mtd subsystem for this. Also agrees with the idea of having a new API and a single mspi_nor.c in drivers/flash, removing the current mess of qspi-specific flash drivers.

I could not attend the meeting (it's really not the right time for me), but has anyone noted that this is again a proposal for dual/quad/octal modes, in addition to other things? So, first: is there anybody really interested in getting such modes exposed as a generic SPI feature? (See PR #39991.) We could avoid having a new API altogether.
@tbursztyka - I also like the idea of supporting dummy clocks before a data phase. Specifically, because some peripherals (e.g. FPGAs) use it for handshaking, and I think that is missing from the current API.
@tbursztyka I’d like to emphasize that the proposed MSPI API would drive a different hardware peripheral from the SPI one. In terms of functionality, this is very much like I3C versus I2C: the MSPI could do everything that SPI does. I would suggest not expanding the SPI, keeping its focus on serial (one-lane), low-speed peripheral support, and then introducing a new API that is multi-lane and focused on supporting advanced memory devices.
As far as I can tell, it is hard to overhaul the SPI API, and that is why everybody just implements their own HAL abstraction controller and has their own device drivers call that controller. By this I mean the qspi and ospi from ST, the qspi from Nordic, and the flexspi from NXP. I’ve looked into each of these controller implementations, and I think there is a lot in common between them; that commonality is what I captured in my new API proposal.
I would also like to point out that there are multiple other things missing from the SPI API. The SPI API lacks ways to configure the hardware when not sending a transfer; this is useful for configuring the hardware before entering XIP mode. Advanced SPI devices demand splitting a transfer into command, addr, and data phases, often with dummy cycles between the addr and data phases. This is not needed for other generic SPI devices.
There is also no multiple-device support or callback management, as generic SPI would not need such things.
There can be multiple memory devices on the same hardware instance. Some hardware supports this natively; for hardware that does not, it can be supported through software.
Since the hardware is more sophisticated than common SPI, it can raise different interrupts even when we are not explicitly requesting a transfer, e.g. XIP events.
I forgot to mention that SPI does not support DQS configuration, which is required for high-speed devices.
- I also like the idea of supporting dummy clocks before a data phase. Specifically, because some peripherals (e.g. FPGAs) use it for handshaking, and I think that is missing from the current API.
@cfriedt Isn't this just a certain amount of 0s "being sent" at certain places, or is there more to this? Sending/receiving dummy bytes in generic SPI is straightforward via the buffer sets. But I get that, as a generic feature, we may need something at a slightly higher level that would allow configuring such dummy clocks (via DTS, etc.). We'll see below how I see things.
@swift-tk Thanks for the details. When it comes to dedicated qspi devices (such as Nordic's), this was the main reason why, at the time, we did not integrate dual/quad/octal into the generic SPI API: these controllers offer hardware optimizations the generic API would not be able to take full advantage of, like XIP or JEDEC. (Btw, have a quick look at #20564; as you can see, it's not a recent issue.)
Do I understand your proposal correctly that you would like to replace all the dedicated qspi, ospi, ... device drivers currently found in flash with ones implementing the MSPI API? And then there would (ideally) be a unique flash_mspi.c driver of some sort?
My own concern has always been this one: are there use cases where a generic SPI controller would be driving both single-mode devices and dual/quad/octal-based devices (most likely a flash memory device)? Does that type of controller even exist? So far I have never been able to get my hands on such a controller. I know that DesignWare has such hardware (though besides the specs I could not get anything). Are there more out there? (I guess there are, or issue #51402 would not have popped up.)
Anyway, here is how I see things at the moment, sticking with the flash use case (I hope my crappy ASCII art shows up the same way everywhere):
```
             FLASH device driver
               (flash_mspi.c)
                     /\
                     \/
MSPI API _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
       /              \               \
      /                \               \
dedicated qspi   dedicated ospi   Generic SPI API
  controller       controller           |
                                   generic SPI
                                    controller
```
Btw, doing that, you could then remove spi_nor.c as well - and only then, as there is no other way around it (unless the MSPI API had to be exposed by ALL existing SPI drivers...). Below the MSPI API, it would use the generic SPI API when relevant (through a dedicated driver, mspi_generic_spi.c for instance).
I hope I am clear enough.
@tbursztyka
@cfriedt Isn't this just a certain amount of 0s "being sent" at certain places, or is there more to this? Sending/receiving dummy bytes in generic SPI is straightforward via the buffer sets. But I get that, as a generic feature, we may need something at a slightly higher level that would allow configuring such dummy clocks (via DTS, etc.). We'll see below how I see things.
Certainly you could construct the buffer such that the dummy cycles sit between addr and data. However, there are two problems I can see. The first is that it is ambiguous: in the controller layer, there is no way to know whether the 0s are actually data or dummies (or what the cmd and addr lengths are), unless the hardware is configured as a generic SPI, in which case it does not care. This leads to the second problem: it is not efficient enough for advanced SPI hardware (qspi, etc.), because oftentimes a transfer is initiated once per buffer. I believe this is part of what you mentioned about hardware optimization? (The advanced hardware could just initiate one transfer for everything.)
Do I understand your proposal correctly that you would like to replace all the dedicated qspi, ospi, ... device drivers currently found in flash with ones implementing the MSPI API? And then there would (ideally) be a unique flash_mspi.c driver of some sort?
Yes, your understanding is correct: the device drivers (flash_mspi_device_name.c) that use the proposed MSPI API could eventually replace all the platform-specific device drivers. But the controller layer would have to be rewritten to implement the proposed API.
My own concern has always been this one: are there use cases where a generic SPI controller would be driving both single-mode devices and dual/quad/octal-based devices (most likely a flash memory device)? Does that type of controller even exist? So far I have never been able to get my hands on such a controller. I know that DesignWare has such hardware (though besides the specs I could not get anything). Are there more out there? (I guess there are, or issue https://github.com/zephyrproject-rtos/zephyr/issues/51402 would not have popped up.)
I haven't seen exactly that in the Zephyr repo, but I do know that the SPI API is used to initialize the flash device in the espi example. However, I don't know enough about Intel eSPI to tell what goes on next. https://github.com/zephyrproject-rtos/zephyr/issues/51402 talked about a device, right? Not a controller? There are actually many flash memory devices that require a serial (one-lane) start-up initialization process; some even have to be switched back to serial mode to send commands. These memory devices are under the JESD251C profile 1.0. My example implementation here https://github.com/AmbiqMicro/ambiqzephyr/blob/RFC-MSPI/drivers/flash/flash_mspi_atxp032.c shows exactly that kind of flow. In any case, a device may still require cmd, addr, dummy, and data phases with different lengths for different commands, even in serial mode. So using the generic SPI would still be a pain in the ass.
If there were such controller hardware, the proposed API could still cover it by setting mspi_dev_cfg/mspi_xfer_packet.ui16InstrLength, mspi_dev_cfg/mspi_xfer_packet.ui16AddrLength, and the dummies to zero to indicate a generic SPI operation. If so desired, to maintain backward compatibility, the controller driver could do the translation and call the old SPI controller driver (this applies to what you drew on the far right). If not, it could just call the platform HAL (this applies to what you drew on the left).
With that being said, I think calling the SPI API from the MSPI API seems to be overkill. It would only spend more processing time and be nowhere near as efficient.
But I could definitely see the SPI and MSPI APIs co-existing for quite a while. And then Zephyr may slowly deprecate the SPI API.
One of the issues with e.g. the lattice ice40 series was the need to invert /CS polarity mid-sequence as a kind of handshake.
It's a definite quirk. Currently it's bit-banged with GPIO because the old Zephyr SPI API does not have a way to make that happen synchronously within the timing requirements. It could maybe be done as a bit of a board hack / platform driver, but doing it with the SPI API would be better.
A few comments here...
I'm pushing to move i2c/spi to use a generalized command queue + completion queue API (rtio), like io_uring or IOCP, as it...
I'm wondering if this would be best served by using a common API for all of this, as it's already being used in an updated sensors API and already has tests verifying transactional behavior, chaining, and cancellation.
Cheers!
@teburd See this paragraph.
The controller can determine which user callback to trigger, based on enum mspi_bus_event_cb_mask, upon completion of each async/sync transfer, if the callback had been registered using mspi_register_callback; or it can trigger no callback at all with MSPI_BUS_NO_CB, even if callbacks are already registered. In the case where a controller supports a hardware command queue, the user can take full advantage of it in terms of performance if scatter IO and callback management are supported.
The mspi_register_callback just registers the callback and callback context. The controller driver should save them according to the event type. I have an xfer-complete event type defined, and the controller may use that to notify async completion. A demo usage of the async call can be found here: https://github.com/AmbiqMicro/ambiqzephyr/blob/RFC-MSPI/samples/drivers/memc/src/main.c
I did not use a software queue in the async implementation, as our controller has a hardware command queue.
Do you have code I could look at? It could certainly be a supplement for those that don't have a hardware command queue. I think it may be better to take a form similar to spi_context.h, so that the interface API doesn't force people to use a software queue when not needed.
Let me reiterate that having callbacks be the primary event notifier here will not work with user space, and register_callback can't be a syscall; I don't think that's being fully appreciated or understood.
This makes me believe the register_callback is stateful and connected to the next transfer?
```c
mspi_register_callback(controller, dev_id, MSPI_BUS_XFER_COMPLETE,
		       (mspi_callback_handler_t)async_cb, &cb_ctx1);
mspi_transceive_async(controller, dev_id, &pack1);
mspi_register_callback(controller, dev_id, MSPI_BUS_XFER_COMPLETE,
		       (mspi_callback_handler_t)async_cb, &cb_ctx2);
mspi_transceive_async(controller, dev_id, &pack2);
```
Is that right? That's a bit suspicious; is it expected that the user of an mspi device maintains an external lock on it?
And you'd like those two async transfers to occur back to back presumably, without something else taking control of the bus controller?
What rtio might look like with spi:
```c
struct rtio_sqe *sqe;
struct rtio_cqe *cqe;
struct rtio_iodev *spi_device = /* MACRO here... */;

sqe = rtio_sqe_acquire(ctx);
rtio_prep_write(sqe, spi_device, buf, buf_len);
sqe->flags |= RTIO_TRANSACTION;

sqe = rtio_sqe_acquire(ctx);
rtio_prep_read(sqe, spi_device, buf, buf_len);

/* Submit the write and read to the spi controller, blocking on 2
 * completions; if 0 is given, the requests are started but the call
 * does not block. */
rtio_submit(ctx, 2);

cqe = rtio_consume_cqe(ctx); /* could block here instead if you want */

/* cqe->result has an int result code */
if (cqe->result != 0) {
	LOG_ERR("write+read to spi device failed");
}
```
Ideally, every rtio sqe would turn into a hardware command/DMA transfer. For lpspi this isn't quite true, as the mcux SDK doesn't provide direct low-level access to the hardware command queue today.
The iodev can contain configuration specifics; in this case that's the chip select, struct device *, clock configuration, and polarity settings typical of spi.
One of the issues with e.g. the lattice ice40 series was the need to invert /CS polarity mid-sequence as a kind of handshake.
It's a definite quirk. Currently it's bit-banged with GPIO because the old Zephyr SPI API does not have a way to make that happen synchronously within the timing requirements. It could maybe be done as a bit of a board hack / platform driver, but doing it with the SPI API would be better.
That would definitely be an improvement for the SPI API; go ahead and make a dedicated issue, I'll look into it.
@swift-tk
Certainly you could construct the buffer such that the dummy cycles sit between addr and data. However, there are two problems I can see. The first is that it is ambiguous: in the controller layer, there is no way to know whether the 0s are actually data or dummies (or what the cmd and addr lengths are), unless the hardware is configured as a generic SPI, in which case it does not care. This leads to the second problem: it is not efficient enough for advanced SPI hardware (qspi, etc.), because oftentimes a transfer is initiated once per buffer. I believe this is part of what you mentioned about hardware optimization? (The advanced hardware could just initiate one transfer for everything.)
I was bringing this up only for the case where a generic SPI controller would be in use (so no hardware features exposed to simplify that). Obviously this does not fit dedicated controllers, which is why I agreed on a configuration exposed via DTS (not for the SPI API, so yes, MSPI would do it). Then, below, it's just a matter of a driver using it relevantly: a dedicated controller would use the hw optimizations, while a generic SPI controller would mimic them by inserting a buffer of 0s of the right length at the proper places.
Do I understand your proposal correctly that you would like to replace all the dedicated qspi, ospi, ... device drivers currently found in flash with ones implementing the MSPI API? And then there would (ideally) be a unique flash_mspi.c driver of some sort?
Yes, your understanding is correct: the device drivers (flash_mspi_device_name.c) that use the proposed MSPI API could eventually replace all the platform-specific device drivers. But the controller layer would have to be rewritten to implement the proposed API.
OK, but then why say this:
But I could definitely see that the SPI and MSPI would co-exist for quite a while. And then Zephyr may slowly deprecate the SPI API.
So far, MSPI seems to be fully designed around a very minimal subset of use cases related to d/q/o, XIP, and JEDEC, targeting flash memories. It brings a lot of features and configuration bits that would not fit 99% of other use cases (thus a lot of overhead, useless ROM/RAM usage - seriously, the proposed structures are super heavy - etc.).
It's confusing. What's the actual focus here? Finally fixing d/q/o, XIP, and JEDEC support in SPI, or replacing the SPI API?
I did not reply to that before, but looking at the previous statement, now is the time:
@nashif suggests introducing this new driver and slowly replacing the spi one with it
That would mean implementing the proposed API in all existing SPI drivers (knowing that the proposed API brings nothing to single-line SPI mode - the 99.999% usage atm). Is that what's being wanted?
That would mean implementing the proposed API in all existing SPI drivers (knowing that the proposed API brings nothing to single-line SPI mode - the 99.999% usage atm). Is that what's being wanted?
I'm not sure this is desirable. To me, these two APIs address different use cases and controllers. We should keep them exclusive.
@teburd I get your point; the syscall will be removed from mspi_register_callback.
This makes me believe the register_callback is stateful and connected to the next transfer?
Not necessarily. The example is a bit biased by the Ambiq HAL. The user could just register the callback and the context once if there is no need to change them throughout the program, but I guess that is generally not the case. So I showed changing the context per async call.
BTW, one async call doesn't necessarily prepare the hardware to start only one transfer. There are use cases where each transfer requires a callback; that's why I have eCBMask inside struct mspi_buf. E.g. partial refreshing of an SPI display: the callback is used to release the lock on the part of memory that has been transferred (flushed), enabling it to be refilled with new data so that optimal performance can be achieved.
Is that right? That's a bit suspicious; is it expected that the user of an mspi device maintains an external lock on it?
Whether it needs an external lock is entirely up to the controller implementation. In my case (Ambiq MSPI), it is not needed, as there are two internal locks: one for the transfer context, one for controller access.
Our HAL is set up to record the callback and context per transfer and call them after the transfer completes. So, just for my case, I would only need mspi_transceive_signal_setup to call mspi_register_callback for asynchronous notification.
And you'd like those two async transfers to occur back to back presumably, without something else taking control of the bus controller?
They would occur back to back, as each writes to the hardware command queue without waiting for the previous one to complete. The controller driver would have to be written in a way that blocks others (another device / the next transfer) from accessing the controller. See an example implementation of the controller driver here: https://github.com/AmbiqMicro/ambiqzephyr/blob/RFC-MSPI/drivers/mspi/mspi_ambiq_ap3.c
I like the RTIO idea; it can certainly supplement the MSPI API the way you did for SPI. But mspi_register_callback is not used solely for the async call, as I indicated here:
```c
/**
 * mspi_register_callback
 *
 * @param controller Pointer to the device structure for the driver instance.
 * @param dev_id     Pointer to the device ID structure from a device.
 * @param evt_type   The event type associated with the callback.
 * @param cb         Pointer to the user implemented callback function.
 * @param ctx        Pointer to the user callback context.
 */
```
This routine provides a generic interface to register different types of bus events. The dev_id is provided so that the controller can identify the device and determine whether access is allowed in a multiple-device scheme. The enum mspi_bus_event is a preliminary list of bus events; XIP events can be added. I encourage the community to come up with more events that they would use.
I'd like to keep this RFC as simple as possible, as the primary focus is to provide a generic API for the advanced SPI devices that the generic SPI fails to serve.
@tbursztyka
But I could definitely see the SPI and MSPI APIs co-existing for quite a while. And then Zephyr may slowly deprecate the SPI API.
So far, MSPI seems to be fully designed around a very minimal subset of use cases related to d/q/o, XIP, and JEDEC, targeting flash memories. It brings a lot of features and configuration bits that would not fit 99% of other use cases (thus a lot of overhead, useless ROM/RAM usage - seriously, the proposed structures are super heavy - etc.).
It's confusing. What's the actual focus here? Finally fixing d/q/o, XIP, and JEDEC support in SPI, or replacing the SPI API?
I would say neither; the primary focus is to provide a generic API for the advanced SPI devices that the generic SPI fails to serve.
My opinion is that the generic SPI can no longer serve for interfacing with advanced SPI devices in terms of overall performance. By the way, the advanced SPI usage is not that small, and believe me, I did not just come up with this without actually trying to drive those advanced SPI devices with the generic SPI. In the cases of transferring a cmd or a small data size, the preparation time is even longer than the actual transfer time when using the generic SPI.
I just said the proposed MSPI API is capable of doing what SPI does; whether it can replace SPI is entirely a different matter. Like you said, it has more configuration than generic SPI devices need, so let people use the generic SPI for simple SPI devices, but leave the complex, heavy-lifting interfacing to a dedicated API. There is no reason to force people to use the generic SPI for complex devices.
Also, the industry has evolved: some common SPI controllers in SoCs have been upgraded to have cmd and data phases. This means that an RX operation no longer needs to start a TX first and then switch to RX, because that can be done through hardware. But the generic SPI treats everything as data, which is not ideal. You would also need software CE/CS control with the generic SPI, while the upgraded controllers do not.
I don't see a way around having more complex data structures as SPI controllers and devices grow more complex. Personally, one of the problems I find with the SPI API is that it is not really expandable.
@erwango I agree with you; we don't need to implement the MSPI API for all legacy SPI device drivers. @tbursztyka That's why I said MSPI and SPI could co-exist. If all your devices are generic SPI, then there is no reason to switch to MSPI. However, the state of the drivers under flash, memc, and possibly other subsystems calls for a generic API, as everybody is using not generic SPI but their own HAL, and those device drivers are all platform-specific.
@tbursztyka Sorry I had to keep editing my responses.
Certainly you could construct the buffer such that the dummy cycles sit between addr and data. However, there are two problems I can see. The first is that it is ambiguous: in the controller layer, there is no way to know whether the 0s are actually data or dummies (or what the cmd and addr lengths are), unless the hardware is configured as a generic SPI, in which case it does not care. This leads to the second problem: it is not efficient enough for advanced SPI hardware (qspi, etc.), because oftentimes a transfer is initiated once per buffer. I believe this is part of what you mentioned about hardware optimization? (The advanced hardware could just initiate one transfer for everything.)
I was bringing this up only for the case where a generic SPI controller would be in use (so no hardware features exposed to simplify that). Obviously this does not fit dedicated controllers, which is why I agreed on a configuration exposed via DTS (not for the SPI API, so yes, MSPI would do it). Then, below, it's just a matter of a driver using it relevantly: a dedicated controller would use the hw optimizations, while a generic SPI controller would mimic them by inserting a buffer of 0s of the right length at the proper places.
Just having the configurations through DTS is not enough. Those settings may change at run time depending on the command, the mode of operation (mspi_io_mode / data rate), and the frequency the device is operating at.
How is exclusive usage of the device done then between multiple calling contexts?
E.g. multiple threads running concurrently or preempting each other, or perhaps even an ISR wanting to start an async transfer that is reaped later. What happens if register_callback is called from context A while an async request started by context B, which had previously registered a callback, is ongoing?
E.g. the flow goes...
thread A: register_callback
thread A: transceive_async
thread B: register_callback
ISR: callback called for thread A's async transceive completion; thread B believes its transceive is done?
It's not very clear to me, at least, how all this is dealt with. It's even less clear how this might work out where you could start an async transfer from an ISR, which turns out to be a really useful thing, for sensors at least.
@teburd
How is exclusive usage of the device done then between multiple calling contexts?
Once again, this is entirely up to the controller driver implementation and how it tackles threading. I took an approach similar to the one in spi_context.h.
E.g. multiple threads running concurrently or preempting each other, or perhaps even an ISR wanting to start an async transfer that is reaped later. What happens if register_callback is called from context A while an async request started by context B, which had previously registered a callback, is ongoing?
In this situation, for my case, I have a local struct mspi_context that contains a copy of the async request from B. The copy includes the callback and callback context, which are brought in before the lock happens and saved to struct mspi_context after B obtains the lock.
This is very similar to spi_context_lock: the lock is not released until all transfers get queued in the hardware queue, so a register_callback from A would not affect what is in struct mspi_context. Because our HAL also records the callback and callback context along with each transfer, B's callback and callback context would not get tampered with.
In the case where the hardware command queue does not take a callback and callback context, or there is no hardware command queue at all, implementing an RTIO software queue can be helpful.
E.g. the flow goes...
thread A: register_callback
thread A: transceive_async
thread B: register_callback
ISR: callback called for thread A's async transceive completion; thread B believes its transceive is done?
No, a register_callback from B may have changed the local storage, but it is not going to affect A, because A's callback and callback context are already in mspi_context.
Now, if we are only looking at the controller driver implementation, there is a chance for the callback and callback-context local storage to get tampered with. Execution flow:
thread A: register_callback
thread B: register_callback
thread A: transceive_async
However, this would not happen if the device driver is implemented correctly, as there is always a third lock. Say, for a read_async function in the device driver, I would expect the following:
```c
acquire(device);
mspi_register_callback();
mspi_transceive_async();
release(device);
```
So thread B is not going to get control of the device and be able to call mspi_register_callback before thread A releases the device control.
@swift-tk
the primary focus is to provide a generic API for advanced SPI devices that the generic SPI fails to achieve.
What are the "advanced SPI devices" out there? Besides memory devices, I mean. Are there any other types?
This was the reason why, in 2019, it was decided not to adopt d/q/o and other features into SPI: it was much simpler to just interface with the flash API (or other memory APIs) directly. This is an important question, because if there are no other device types (sensors or else), then what would be the benefit of adding overhead via an in-between API such as the proposed one, versus what we have now in flash controllers, for instance?
I disagree with the "generic SPI fails to achieve" statement: as long as these advanced features are not supported in the API, obviously it will fail to provide them.
Then comes another question: what blocks the possibility to improve the existing SPI API to enable those features versus creating an entirely new API?
I did not reply to that before, but looking at the previous statement, now is the time:
@nashif suggests introducing this new driver and slowly replacing the spi one with it
That would mean implementing the proposed API in all existing SPI drivers (knowing that the proposed API brings nothing to single-line SPI mode - the 99.999% usage atm). Is that what's being wanted?
That was not my point at all. My point here is not to get into a discussion of how we can support this with the current SPI API and end up getting nowhere. We have had this before with many APIs, and trying to over-complicate existing APIs to support new hardware and devices should not be the first option we think about. What I am saying is: introduce mspi as a new API; if there is a path to add compatibility with existing spi, it should be done; if it is good enough to cover the existing drivers, only then will we consider folding everything under one API. But let's not spend too much time trying to make this work now, impacting the current (stable) API and the drivers implementing it.
@tbursztyka
What are the "advanced SPI devices" out there? Besides memory devices I mean. Are there any other types?
For instance, displays and sensors that are driven by multi-lane SPI. In fact, SPI devices that operate at high speed (>= 50 MHz CLK) more or less share the same characteristics, i.e. multi-lane, variable latency, DQS, I/O mode switching, etc.
This was the reason why, in 2019, it was decided not to adopt d/q/o and other features into SPI because it was much simple to just interface the flash API (or other memory APIs) directly. This is an important question because, if there are no other devices types (sensors or else), then what will be the benefit to add overhead via an in-between API such as the proposed one, versus what we have now in flash controllers for instance?
True, that may have been the case 5 years ago, but a lot has happened in 5 years in the electronics business. Categories of devices have now grown really big in wearables, smartphones, and IoT.
What Zephyr has now under the flash controller, aside from SoC-internal NV flash, is, I believe, chaos. Each SoC vendor has to create its own controller abstraction with no standard whatsoever, since SPI cannot support those controllers, which are nowadays common enough to justify a standardized API. I would argue that any API carries some overhead, and I believe it is better to have an API that adheres to the idea of easy switching between SoC platforms.
Then comes another question: what blocks the possibility to improve the existing SPI API to enable those features versus creating an entirely new API?
Well, as you said, there would be more overhead for those generic SPI devices that don't support these features. Also, the existing SPI API and MSPI drive different peripheral hardware. I think it is impossible to achieve functionality similar to the proposed MSPI without breaking the current SPI. Even if we somehow managed to maintain backward compatibility with SPI, there would probably be more overhead than bringing in a new API.
If I had to name one thing, I particularly dislike the way that SPI passes `spi_config` when initiating a transfer. In my view, most of its fields are device/hardware settings rather than transfer-related things.
I think that, in order to counter the emergence of this new API, which clearly has uses, someone must propose an alternative where the current SPI API is extended to cover the use cases that this one does. @tbursztyka is this something you could take a look at?
@tbursztyka some of your questions relate to discussions that took place in the Arch WG meeting, including having a single `mspi_nor.c` in `drivers/flash`, or the types of hardware peripherals that would be driven by this new API.
It's not a problem that you cannot attend, but could you take a look at the recording? There was input from ST and NXP as well that positively acknowledged the differences in hardware between standard SPI and "advanced SPI" peripherals.
Here is the recording: https://zoom.us/rec/share/KHaMiI1O7r7vmQdSvXp-LnnSRIed2ikNeiUE7zIVUiRRgdJ-fGjbJUAKxq8awxGZ.p8AMB9sstvctuYJB
let's not spend too much time trying to make this work now at the cost of impacting the current (stable) API and the drivers implementing it.
@nashif Got it, the report was confusing. Just went through the link @carlescufi has provided, it's much clearer.
@swift-tk
True, that may have been the case 5 years ago, but a lot has happened in the electronics business since then. These categories of devices have grown really big in wearables, smartphones and IoT.
Ok, that's the answer I was looking for. Because as soon as there are more than just memory devices out there using these features, then yes, it makes sense to make a common API for it despite the added overhead. (I note, however, that during the arch presentation, only memory devices were discussed.)
Then comes another question: what blocks the possibility to improve the existing SPI API to enable those features versus creating an entirely new API?
Well, as you said, there would be more overhead for those generic SPI devices that don't support these features. Also, the existing SPI API and MSPI drive different peripheral hardware. I think it is impossible to achieve functionality similar to the proposed MSPI without breaking the current SPI. Even if we somehow managed to maintain backward compatibility with SPI, there would probably be more overhead than bringing in a new API.
I tend to agree with @nashif as well here. Expanding the existing SPI API is possible (actually my old proposal is quite obsolete anyway), but at the cost of needing more caution not to break anything, etc. And indeed, in the end, perhaps we could someday, if relevant, reach a point of merging both into a unique API.
That said, when you say:
no support for multiple devices on the same controller instance.
That's simply not true. Of course the existing API can handle multiple devices on the same controller instance. Have you spent enough time understanding the existing API?
If I had to name one thing, I particularly dislike the way that SPI passes `spi_config` when initiating a transfer. In my view, most of its fields are device/hardware settings rather than transfer-related things.
That was done for a reason: as soon as you have 2+ peripherals connected to the same SPI controller, which by definition have no idea about each other, how does peripheral A, when initiating a transfer, know that the controller is properly configured for itself and not for peripheral B (and vice versa)?
Under the hood, the controller only has to re-configure if the given configuration pointer (these are meant to be constant) changes. It's fully transparent for the user in the end, who has no need to worry about configuration status. That is basically what enables multiple devices managed by one controller driver, btw.
It was either this solution or allocating a fixed array of `spi_config` pointers in the controller driver and setting them via dedicated configure() calls. The above solution was easier (DTS integration was very much in its infancy; there was almost no way to know how many devices were hooked up, etc.). At bottom, it ends up the same for the controller either way: it has to configure/reconfigure itself, if relevant, prior to any transfer.
@tbursztyka
Understood. I clearly missed that, as it is quite ambiguous and I didn't find anyone using SPI for multiple devices. I'd like to point out that it is still a poor solution by convention, as an API call should not be doing two different things at once, i.e. switching between slaves and starting a transfer. I guess it was just done the easy way rather than the right way.
Anyway, I'm glad that we finally reached some consensus.
@tbursztyka @carlescufi May I ask what the next steps for this RFC are? Am I approved to implement it, or does it need further discussion?
Architecture WG:
@tbursztyka @carlescufi May I ask what the next steps for this RFC are? Am I approved to implement it, or does it need further discussion?
The PR should include:
* Header file
* At least one implementation of the new bus API (i.e. the hardware peripheral)
* A device driver that uses the new bus API (e.g. a flash memory)
* A test that runs under an emulator
* Documentation (see https://docs.zephyrproject.org/latest/hardware/peripherals/index.html)
* No objections at the Arch WG, missing an explicit ACK from @tbursztyka who is the maintainer
It's a new API, so nobody's ack is necessary here afaik.
By "ACK" I meant whether you are OK with adding a new API instead of extending the current one.
Yes as described by @nashif, it's better to have something out of the existing SPI API.
Excellent, thanks @tbursztyka. @swift-tk feel free to proceed with the RFC.
The original context in which I was thinking about this issue (see #51402) was conceptually using a bt81x connected to a Nordic nRF52840.
Using QSPI on the bt81x requires instructing it to switch from x1 mode to x2 or x4 mode by writing to a "REG_SPI_WIDTH" register, which defaults to x1. So you have to write in x1 mode first and then switch to x2/x4 mode.
To me a new but functionally compatible MSPI bus is fine, as long as you can identify which bus a device is attached to and switch driver mode, like we do with dual SPI/I2C drivers.
In relation to QSPI and multiple devices: as far as I am aware, multiple devices on SPI is relatively standard, and I can see this sometimes being necessary for QSPI as well, e.g. the Nordic nRF52840 with a bt81x and Matter usage.
When the internal flash storage on the nRF52840 is insufficient, external flash is required; if you want a QSPI display in combination with this, you need to connect both to the same QSPI bus controller.
The nRF52840 QSPI module only has one CS, so you can't connect multiple devices simultaneously.
It looks like you can reconfigure the pin multiplexing to connect both (deactivate the module, change pins, reactivate the module); however, the controller driver has to know that it would have to do this and have access to the necessary information.
Secondly, the bus controller driver would have to know which mode each device is currently operating in, as talking x4 to an x1 device (and vice versa) is unlikely to work, and devices may have individual mode-switching requirements.
I'm not sure how the other SPI "modes" extend to QSPI.
@firebladed
In your case, the device switching will have to be managed by software; `struct mspi_dev_id` is for this purpose. The controller would then know which device it should talk to and conduct switching when necessary, i.e. changing pins via pinctrl. And there should be no need to deactivate and reactivate the device, as long as the controller implementation handles context switching and threading.
The controller driver does not need to know the device operating mode, as long as the device drivers record their current operating mode and simply reconfigure the controller during device switching. But for debugging convenience, the controller driver may record the device-specific settings (`struct mspi_dev_cfg`) for the current device occupying the bus.
@swift-tk I think an `mspi_is_xip` would be required, to be able to query XIP state. For instance, if XIP is set by a bootloader, then in the application image you need to know the XIP status for various reasons.
Hi @erwango, that can surely be integrated into `mspi_get_channel_status`. We could have an enum for the returned status.
If the intent is for this API to be used by Zephyr Sensors moving forward, then it likely needs to implement the rtio API. Are there sensors in mind that will use this?
I have created a draft PR here https://github.com/zephyrproject-rtos/zephyr/pull/71769 for the RFC. I invite anyone who is interested to take a first glance.
@carlescufi Does it need to be discussed again in Arch WG meetings, since I had to split these items into more than one PR?
The PR should include:
* Header file
* At least one implementation of the new bus API (i.e. the hardware peripheral)
* A device driver that uses the new bus API (e.g. a flash memory)
* A test that runs under an emulator
* Documentation (see https://docs.zephyrproject.org/latest/hardware/peripherals/index.html)
Hi @erwango That can surely be integrated into `mspi_get_channel_status`. We could have an enum for the returned status.
Tbh, I'm not clear on the concept of a channel in this API. What is a channel in the context of this API, and how would a client know which channel it should query the status of? Anyway, let me check the draft PR :)
That's just a placeholder for now; I saw some vendors use it. From what I understand, it is the controller instance number (for the actual hardware).
If the intent is for this API to be used by Zephyr Sensors moving forward, then it likely needs to implement the rtio API. Are there sensors in mind that will use this?
From me, for the moment, I don't have any sensors in mind. Maybe @RichardSWheatley has some plans.
No plans currently for me either.
On STM32 at least, the targeted controllers are not intended to be used with sensors but are focused on external memories.
Perfect. I wanted to ensure we didn't do lots of great new work only to turn around and note that it doesn't fit with the newer sensor API flows we are pushing forward.
@teburd I've been looking at RTIO and the `rtio_sqe_prep_*` function usages, and I have a few questions. The `rtio_sqe_prep_callback` entry seems to always be the last element of the entries in the software queue that make up "a transfer". The user callback is called when the top of the queue reaches that element with OP_CALLBACK, and all other entries get passed to the controller driver. This seems to imply that whatever transfer started before the user callback had already completed (verified by observing the waveform) by the time the callback is called.
If my understanding is correct, then it is not truly async, and would not be compatible with a controller implementation that uses a hardware command queue for async transfer calls but does not wait for the transfer to complete or handle the completion signaling.
To be compatible with what I'm proposing here, the `rtio_executor_op` would need to wait for a completion signal from a lower-level callback registered using `mspi_register_callback`. Luckily, `mspi_register_callback` would only need to be called once, during initialization. Even with this, the advantage of the hardware queue would not be fully utilized, as it would stop pushing to the hardware queue between two transfer calls (wait for the first call to complete and then push the second call), unless the user callback is handled in a standalone queue (maybe the completion queue).
Closing, as https://github.com/zephyrproject-rtos/zephyr/pull/71769 is merged.
Introduction
Rather than attempting to overhaul the existing SPI interface, adding a new MSPI API may be a better option for both new and existing users.
Problem description
The existing Zephyr SPI API has many limitations, including but not limited to:
Proposed change
The MSPI interface should contain a controller driver that is SoC-platform specific and implements the following APIs. Device drivers should then reference these APIs so that unified device drivers can be achieved.
Note: `mspi_register_callback` is to be removed from the list of syscalls.
Detailed RFC
Methodology
To better serve modern-day memory devices, I have divided the configurations into three categories:
API Detail
Now let's dive deeper and check out what each API should do.
This routine provides a generic interface to override MSPI controller capabilities. In the controller driver, one may implement this API to initialize or re-initialize the controller hardware. Additional SoC-platform-specific settings that are not in struct mspi_cfg may be added to one's own binding (xxx,mspi-controller.yaml) so that one may derive the settings from DTS and configure them in this API. In general, these settings should not change at run-time.
This routine provides a generic interface to override MSPI controller device-specific settings. With struct mspi_dev_id defined as the device index and the CE GPIO from the device tree, the API supports multiple devices on the same controller instance. It is up to the controller driver implementation whether to support device switching by software or by hardware. The implementation may also support individual parameter configuration as specified by enum mspi_dev_cfg_mask. The settings within struct mspi_dev_cfg typically don't change once the mode of operation is determined after device initialization.
An example of the DTS.
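A minimal, hypothetical devicetree fragment for one controller instance with two devices might look like this; the node labels, compatible strings, and property names below are illustrative placeholders only, not an actual binding:

```dts
&mspi0 {
	status = "okay";

	flash0: flash@0 {
		compatible = "vnd,mspi-flash";     /* hypothetical binding */
		reg = <0>;                          /* device index on the bus */
		mspi-max-frequency = <48000000>;
	};

	psram0: psram@1 {
		compatible = "vnd,mspi-psram";     /* hypothetical binding */
		reg = <1>;
		mspi-max-frequency = <96000000>;
	};
};
```

The point is that each child node carries its own device index and device-specific settings, from which the driver can build its struct mspi_dev_id and struct mspi_dev_cfg.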
This routine provides a generic interface to check whether the hardware is busy. This is useful in a multiple-device scheme.
This routine provides a generic interface to register different types of bus events. The dev_id is provided so that the controller can identify its device and determine whether the access is allowed in a multiple device scheme. The enum mspi_bus_event is a preliminary list of bus events. There are XIP events that can be added. I encourage the community to come up with more events that they would use.
This routine provides a generic interface to transfer a request synchronously or asynchronously. The dev_id is provided so that the controller can identify its device and determine whether the access is allowed in a multiple-device scheme. The req is of type mspi_xfer_packet, which allows dynamically changing the transfer-related settings once the mode of operation is determined and configured by mspi_dev_config. The API supports bulk transfers with different starting addresses and sizes via struct mspi_buf. However, it is up to the controller implementation whether to support scatter IO and callback management. The controller can determine which user callback to trigger based on enum mspi_bus_event_cb_mask upon completion of each async/sync transfer, if the callback has been registered using mspi_register_callback, or trigger no callback at all with MSPI_BUS_NO_CB even if callbacks are registered. In the case that a controller supports a hardware command queue, the user can take full advantage of it in terms of performance if scatter IO and callback management are supported.
This routine provides a generic interface to configure the XIP and scrambling features. Typically, the cfg parameter includes an enable flag and the address range it takes effect on. I also wouldn't expect these settings to change often.
This routine provides a generic interface to configure timing parameters that are SoC-platform specific. If it is used, one should provide one's own definitions of param_mask and the cfg type in one's own *.h file.
Dependencies
To compile, one needs to check out "apollo3p-dev-mspi" at https://github.com/AmbiqMicro/ambiqhal_ambiq.git and "RFC-MSPI" at https://github.com/AmbiqMicro/ambiqzephyr.git
The branch is based off of a PR that has yet to be merged into Zephyr main: https://github.com/zephyrproject-rtos/zephyr/pull/67815. Please look at these commits for the example implementations.
The API prototype is at https://github.com/AmbiqMicro/ambiqzephyr/blob/RFC-MSPI/include/zephyr/drivers/mspi.h
Example code
Example implementation of the MSPI API can be found at zephyr\drivers\mspi\mspi_ambiq_ap3.c. Example usage of the MSPI API can be found in the following files:

* zephyr\drivers\memc\memc_mspi_aps6404l.c: PSRAM APS6404L device driver
* zephyr\drivers\flash\flash_mspi_atxp032.c: NOR FLASH ATXP032 device driver
* zephyr\samples\drivers\memc\src\main.c: demos the usage of mspi_transceive_async