Deniz-Eren / dev-can-linux

Porting of Linux CAN-bus drivers to QNX
GNU General Public License v2.0

PCIe 0x05 (MSI) and 0x11 (MSI-X) capability support #7

Closed Deniz-Eren closed 8 months ago

Deniz-Eren commented 1 year ago

Add PCIe capability ID 0x05 (MSI) support

An issue was discovered with cards that expose PCIe capability 0x05 (MSI). Looking at the dev-can-linux -vvvvv output you can check whether a device supports the 0x05 (MSI) capability:

read capability[0]: 1
read capability[1]: 5     <---- THIS ONE
read capability[2]: 10

When using PCIe cards, we found (on an equivalent Linux platform) that when MSI capability IRQs are available but the driver does not use them, some issues arise. From what we can tell, on rare occasions the IRQ event is received before the chipset has data available to read. The fix is believed to be the driver's use of capability 0x05 (MSI). The issue is very rare and can only be detected under heavy-traffic testing conditions.

Deniz-Eren commented 9 months ago

Latest testing (commit 2bdb6bf52d26fa96d6760a117f4aa5460535c5c3) shows that feature https://github.com/Deniz-Eren/dev-can-linux/issues/4 is required before this PCIe 0x05 (MSI) capability support can be utilized. The current version only supports allocating a single IRQ per device, so the driver aborts when the MSI feature requests multiple IRQs (8 in this case).

[QNXTEST]# dev-can-linux -vvvvv
dev-can-linux v1.0.27 (commit 2bdb6bf52d26fa96d6760a117f4aa5460535c5c3)
dev-can-linux comes with ABSOLUTELY NO WARRANTY; for details use option `-w'.
This is free software, and you are welcome to redistribute it
under certain conditions; option `-c' for details.
driver start (version: 1.0.30)
Auto detected device (13fe:d7) successfully: (driver "adv_pci")
initializing device 13fe:00d7
pci_enable_device: 13fe:d7
read ssvid: 13fe
read ssid: d7
read cs: 0, slot: 0, func: 0, devfn: 0
read ba[0] MEM { addr: df102000, size: 800 }
io threshold: 0; I/O[0:0], MEM[df102000:df102800]
read ba[1] MEM { addr: df101000, size: 80 }
io threshold: 0; I/O[0:0], MEM[df101000:df102800]
read ba[2] MEM { addr: df100000, size: 80 }
io threshold: 0; I/O[0:0], MEM[df100000:df102800]
read capability[0]: 1
read capability[1]: 5
nirq: 8
capability 0x5 (MSI) enabled
read irq[0]: 258
read irq[1]: 259
read irq[2]: 260
read irq[3]: 261
read irq[4]: 262
read irq[5]: 263
read irq[6]: 264
read irq[7]: 265
read multiple (8) IRQs
Deniz-Eren commented 9 months ago

An alternative to https://github.com/Deniz-Eren/dev-can-linux/issues/4 would be to use the cap_msi_set_nirq function and request only 1 IRQ. This feels like a better approach given that the downstream drivers expect to use only 1 IRQ.

Deniz-Eren commented 9 months ago

Experiments with the cap_msi_set_nirq function, requesting 1 IRQ, seem to imply that only 1 of the 8 IRQs is handled.

Initially I expected that asking for 1 IRQ would mean the OS would deliver all 8 IRQs through the single requested vector. Experiments were done with a temporary implementation of the MSI feature using QEmu. When a single MSI IRQ was implemented via QEmu, the driver requesting a single IRQ worked perfectly. This suggests that asking for a single IRQ leaves the other 7 IRQs unhandled, rather than all IRQs being represented by the one requested.

When attempting to implement 8 MSI IRQs with QEmu, I ran into issues and limitations with QEmu and multiple MSI IRQs. MSI requires a contiguous block of vectors, and QEmu has limited support for actually making use of more than a single vector. Also, almost all real hardware that needs multiple vectors implements MSI-X; the only exceptions we have encountered are these CAN-bus cards. Both the Advantech and the Peak cards have capability 0x05 (MSI) but do not have capability 0x11 (MSI-X) available.

For example:

B003:D00:F00 @ idx 27
        vid/did: 13fe/00d7
                <vendor id - unknown>, <device id - unknown>
        class/subclass/reg: 0c/09/00
                CANbus Serial Bus Controller
        revid: 0
        cmd/status registers: 7/10
        Capabilities list (3):
                     01 (PMI) --> 05 (MSI) --> 10 (PCIe)
        Address Space list - 3 assigned
            [0] MEM, addr=df102000, size=800, align: 800, attr: 32bit CONTIG ENABLED
            [1] MEM, addr=df101000, size=80, align: 80, attr: 32bit CONTIG ENABLED
            [2] MEM, addr=df100000, size=80, align: 80, attr: 32bit CONTIG ENABLED
        Interrupt list - 0 assigned
        hdrType: 0
                ssvid: 13fe  ?
                ssid:  00d7

        PCIe Capability Details
                PCIe port Type: [0] (EP) Endpoint Device
                PCIe Extended Capabilities (1):
                     03 (DEV S/N v1)

        PMI Capability Details
                Module Does Not Exist [(PCI_ERR_NO_MODULE)]

        Device Dependent Registers
                [040] 00000000  00000000  00000000  00000000  
                [050] 00037001  00000000  00000000  00000000  
                [060] 00000000  00000000  00000000  00000000  
                [070] 00869005  00000000  00000000  00000000  
                [080] 00000000  00000000  00000000  00000000  
                [090] 00010010  00008000  00002010  0043f811  
                [0a0] 10110040  00000000  00000000  00000000  
                [0b0] 00000000  00000000  00000000  00000000  
                  :
                [100] 00010003  00000000  00000000  00000000  
                [110] 00000000  00000000  00000000  00000000  
                  :
                [ff0] 00000000  00000000  00000000  00000000

Furthermore, it looks like the use of MSI or MSI-X will require multiple IRQ support after all.

Thus the best way forward is to implement MSI-X in the driver, both to get QEmu emulation working (by creating a hypothetical card with MSI-X) and for better device operation with potential future cards. That version can be used to implement and test the multi-IRQ support feature; real hardware can then be used to test MSI-only (without MSI-X) devices.

Deniz-Eren commented 9 months ago

https://github.com/Deniz-Eren/dev-can-linux/issues/19

Deniz-Eren commented 8 months ago

QEmu MSI and MSI-X features are being drafted in the forked repository branch feature/can-sja100-pci-msi-support.

Deniz-Eren commented 8 months ago

Multiple IRQ support is now implemented and rebased onto pull request https://github.com/Deniz-Eren/dev-can-linux/pull/8

MSI-X support https://github.com/Deniz-Eren/dev-can-linux/issues/19 will be implemented and tracked to completion within this ticket also.

Deniz-Eren commented 8 months ago

Pending real hardware validation

Deniz-Eren commented 8 months ago

A working solution has been achieved with the new MSI-X support for QEmu and experimentation with the QNX MSI-X masking functions. Work is in progress to commit the fix.

For MSI-X, masking and unmasking individual IRQs via cap_msix_mask_irq_entry() and cap_msix_unmask_irq_entry() has yielded a good working solution.

At this stage it is unknown whether InterruptMask() or InterruptUnmask() is needed in addition to the MSI/MSI-X masking functions; however, it sounds like the two mechanisms are mutually exclusive. To be verified.

With MSI support, using our experimental QEmu implementation, cap_msi_mask_irq_entry() and cap_msi_unmask_irq_entry() return the error PCI_ERR_ENOTSUP (Requested Operation Is Not Supported), because the QEmu PCI bus appears to be legacy and does not support Per Vector Masking (PVM); thus real hardware testing is needed.

However, we would also like to support PCI bus devices that do not have Per Vector Masking (PVM), so further experimentation is needed with other masking functions. It should be possible to mask and unmask all of the allocated interrupt vectors together when one interrupt occurs. At the least, the error implies that if per-vector masking isn't allowed, then bulk vector masking must be the alternative, legacy method.

The other item to investigate is the earlier conclusion, reached when testing with cap_msi_set_nirq(), that we must support multiple IRQs. It may be that incorrect masking led us to that conclusion. The MSI-X system definitely has support for sharing and redirecting IRQs to a single IRQ or a subset of IRQs using interrupt disposition. Therefore, for both MSI and MSI-X this option should be investigated, and if it works, the driver must offer options for the user to configure their desired solution.

Deniz-Eren commented 8 months ago

The current commit now has all fixes necessary for MSI-X operation, verified end-to-end with QEmu.

MSI is pending real hardware verification, since QEmu seems not to support PCI 3.0; QEmu has leapfrogged these legacy features in favour of PCIe MSI-X.

That is, legacy MSI from PCI 3.0 onwards allows each interrupt to be masked individually. If this feature is not available on the device, that is, if Per Vector Masking (PVM) isn't supported, then the driver currently reverts to regular IRQ operation.

If we can figure out a way to mask and unmask non-PVM MSI hardware (pre-PCI 3.0), then we can support those legacy devices with MSI as well (Capability ID 0x5 (MSI)). Until then, such devices will have to make do with regular IRQ support only.