enjoy-digital / litepcie

Small footprint and configurable PCIe core

DMAReader not working after update to 2023.12 #125

Closed smunaut closed 10 months ago

smunaut commented 10 months ago

I just updated to 2023.12 (from 2023.4) and AFAICT the DMAReader isn't working. It starts transferring and then just stops after only a few words.

dump.zip

I've added a trace on the port of the DMAReader. Here are the captured signals:

                self.pcie_dma0.reader.port.source.channel,
                self.pcie_dma0.reader.port.source.user_id,
                self.pcie_dma0.reader.port.source.first,
                self.pcie_dma0.reader.port.source.last,
                self.pcie_dma0.reader.port.source.we,
                self.pcie_dma0.reader.port.source.adr,
                self.pcie_dma0.reader.port.source.req_id,
                self.pcie_dma0.reader.port.source.tag,
                self.pcie_dma0.reader.port.source.len,
                self.pcie_dma0.reader.port.source.ready,
                self.pcie_dma0.reader.port.source.valid,

                self.pcie_dma0.reader.port.sink.ready,
                self.pcie_dma0.reader.port.sink.valid,
                self.pcie_dma0.reader.port.sink.first,
                self.pcie_dma0.reader.port.sink.last,

                self.pcie_dma0.source.valid,
                self.pcie_dma0.source.ready,

Here's also the lspci -vv output from both the device and the PCIe port:

lspci_port.txt lspci_dev.txt

One thing I find weird is that the Max Read Request size doesn't match between the two:

Device :

        Capabilities: [70] Express (v2) Endpoint, MSI 00
                DevCap: MaxPayload 1024 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 0.075W
                DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
                        RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
                        MaxPayload 256 bytes, MaxReadReq 512 bytes

PCIE controller:

                        ExtTag- RBE+
                DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
                        RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
                        MaxPayload 256 bytes, MaxReadReq 128 bytes 
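For anyone comparing the two dumps, here is a minimal Python sketch that pulls MaxPayload and MaxReadReq out of the DevCtl lines quoted above. The helper name `parse_devctl` is made up for illustration. As far as I understand the PCIe spec, MaxReadReq in DevCtl limits the read requests that the function itself issues as a requester, so the endpoint and the port having different values is not necessarily an error on its own; MaxPayload is the setting that normally has to agree along the path.

```python
import re

# DevCtl excerpts copied from the two lspci -vv dumps quoted above.
DEVICE_DEVCTL = "MaxPayload 256 bytes, MaxReadReq 512 bytes"
PORT_DEVCTL = "MaxPayload 256 bytes, MaxReadReq 128 bytes"


def parse_devctl(line):
    """Return (max_payload, max_read_req) in bytes from an lspci DevCtl line.

    Hypothetical helper, not part of litepcie or pciutils.
    """
    m = re.search(r"MaxPayload (\d+) bytes, MaxReadReq (\d+) bytes", line)
    if m is None:
        raise ValueError("no MaxPayload/MaxReadReq found in: %r" % line)
    return int(m.group(1)), int(m.group(2))


dev_mps, dev_mrrs = parse_devctl(DEVICE_DEVCTL)
port_mps, port_mrrs = parse_devctl(PORT_DEVCTL)

print("MaxPayload matches:", dev_mps == port_mps)
print("MaxReadReq device/port:", dev_mrrs, port_mrrs)
```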

And in the trace it seems the DMA reader is issuing read requests of 512 bytes (judging by how much the address increments ... but then again the size is 0x80, which would be 128, so I'm confused unless the size is in 32-bit words).
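On the size question: the Length field of a PCIe TLP header is expressed in 32-bit words (DWORDs), not bytes. Assuming the `len` signal in the trace follows that convention (an assumption on my side, but it is consistent with the observed address increment), the arithmetic works out as in this small sketch. The function name is only for illustration.

```python
DWORD_BYTES = 4  # one DWORD is 32 bits


def tlp_len_to_bytes(len_dwords):
    """Convert a request length in DWORDs to bytes.

    Assumes the length is given in 32-bit words, as in the TLP header
    Length field; this is not taken from the litepcie sources.
    """
    return len_dwords * DWORD_BYTES


observed_len = 0x80  # value of `len` seen in the trace
request_bytes = tlp_len_to_bytes(observed_len)

# 0x80 DWORDs is 512 bytes, matching the address increment in the trace
# and the device's MaxReadReq of 512 bytes.
assert request_bytes == 512
print(request_bytes)
```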

Target is a Xilinx USP (ez11eg) PCIe gen3 x8

smunaut commented 10 months ago

Tried reverting to 5e3383c (which is the last commit before the "cleanup") and this works as expected.

So definitely something got broken ...

smunaut commented 10 months ago

Checked 76c7381ad8d7eab01020fd9bbe0249166535f2a8 and it fails (it's the merge commit of the "cleanup").

I'm looking at the phy req_sink and what's sent seems similar in both cases, but what comes back is not :/

smunaut commented 10 months ago

dumps.zip

smunaut commented 10 months ago

For some reason the straddle option is turned on in uspciephy.py and usppciephy.py but the conversion logic doesn't support that mode of operation at all ...

 "AXISTEN_IF_RC_STRADDLE"       : True,

Turning that off fixes my issue.

I'm not sure how it works for anyone with that enabled ...

smunaut commented 10 months ago

My bad ... this fix is not complete. It's one problem but 526a3d05aafcf8877b7cb9f893de82f3bd85a6e4 also breaks it ...

smunaut commented 10 months ago

I think that commit hasn't been tested with address_width == 32, or on Zynq MP (device ID == xczu).

enjoy-digital commented 10 months ago

xczu is maybe missing here: https://github.com/enjoy-digital/litepcie/blob/600bd13d5d5adabf452fb1bdf547475134de7157/litepcie/tlp/packetizer.py#L807

smunaut commented 10 months ago

Yes, it is.

But that alone doesn't solve the problem. If I set address_width = 64 (and add xczu), then it works, but in 32-bit mode it doesn't. The address sent to the core ends up wrong (the upper and lower 32 bits are swapped).
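To make the symptom concrete, here is a small Python model of what swapping the two 32-bit halves does to an address. This is only an illustration of the effect described above, not the litepcie packetizer code, and the example address is made up.

```python
MASK32 = 0xFFFFFFFF


def swap_dwords(addr64):
    """Swap the upper and lower 32-bit halves of a 64-bit value."""
    lo = addr64 & MASK32
    hi = (addr64 >> 32) & MASK32
    return (lo << 32) | hi


# Hypothetical DMA buffer address that fits in 32 bits.
addr = 0x12345000
swapped = swap_dwords(addr)

# The 32-bit address lands in the upper half, so the core would see a
# completely different 64-bit address.
print(hex(swapped))

# Swapping twice gives the original value back.
assert swap_dwords(swapped) == addr
```

With a 32-bit address the lower half of the swapped value is zero, which would explain requests going to the wrong location rather than failing outright.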