enjoy-digital / litepcie

Small footprint and configurable PCIe core

PCIe issues on ADRV2CRR-FMC #88

Closed smunaut closed 2 years ago

smunaut commented 2 years ago

Issue

Trying to bring up PCIe (Gen3 x4 and Gen3 x8) on this board yielded some unexpected issues, and it took some time to find a sequence that works.

I'm documenting here my observations, my theory about what the problems are, and the workarounds.

Test Setup

First, a description of the setup:

Initial Observation ( Feb 22 )

Theories about problems ( Feb 23 )

After more testing the next day, I think there are several distinct problems, which is why the symptoms are weird and the results of the different cases make little sense.

smunaut commented 2 years ago

So I tried tracing the various states it goes through during several events:

Initial configuration (with the PC already booted)

Trigger retrain (set the retrain bit on the root port bridge)

Disable link (set the disable link bit on the root port bridge)

Enable link (clear the disable link bit on the root port bridge)

Reboot

This can end up in one of two scenarios :

or :

In both cases, I can't get it to change state ever again, no matter which bits I poke on the root bridge (putting it to sleep, disabling/re-enabling the link, requesting a retrain, ...).
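For anyone wanting to reproduce the root-port pokes described in this comment, here is a minimal sketch of how the retrain and link-disable writes could be issued with `setpci`. The bit positions are those of the standard PCIe Link Control register (offset 0x10 in the PCI Express capability: bit 4 is Link Disable, bit 5 is Retrain Link). The root port address `00:01.0` and the helper name `setpci_command` are assumptions for illustration only; this is not necessarily how the tests above were run.

```python
# PCIe Link Control register: offset 0x10 inside the PCI Express capability.
LINK_CONTROL_OFFSET = 0x10
LINK_DISABLE = 1 << 4  # bit 4: Link Disable
RETRAIN_LINK = 1 << 5  # bit 5: Retrain Link

# Hypothetical root port address; find the real one with `lspci -t`.
ROOT_PORT = "00:01.0"


def setpci_command(bdf, value, mask):
    """Build a setpci command that writes only the masked Link Control bits.

    setpci accepts `value:mask`, so bits outside the mask are left untouched.
    """
    return (
        f"setpci -s {bdf} "
        f"CAP_EXP+{LINK_CONTROL_OFFSET:x}.w={value:04x}:{mask:04x}"
    )


# Trigger retrain: set the Retrain Link bit.
retrain_cmd = setpci_command(ROOT_PORT, RETRAIN_LINK, RETRAIN_LINK)
# Disable link: set the Link Disable bit.
disable_cmd = setpci_command(ROOT_PORT, LINK_DISABLE, LINK_DISABLE)
# Enable link: clear the Link Disable bit.
enable_cmd = setpci_command(ROOT_PORT, 0, LINK_DISABLE)

if __name__ == "__main__":
    for cmd in (retrain_cmd, disable_cmd, enable_cmd):
        print(cmd)
```

The script only prints the commands; running them requires root and the address of the actual root port upstream of the FPGA.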

smunaut commented 2 years ago

Another interesting result: I tried commenting out https://github.com/enjoy-digital/litepcie/blob/master/litepcie/phy/usppciephy.py#L97

The idea is that some of the logic may need to see a clock while reset is asserted.

And that seems to reliably allow the FPGA to be detected through a reboot!
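To illustrate the hypothesis (and only the hypothesis), here is a toy Python model of a register with a synchronous reset. It is not LitePCIe code and says nothing about what the hard PCIe block actually contains; the function name `run` and the value `0xBAD` are made up for the example. It shows that if the clock is gated off whenever reset is asserted, a synchronous reset never sees a clock edge and so never takes effect.

```python
def run(gate_clock_on_reset, reset_cycles=4):
    """Return the register value after holding reset for `reset_cycles`."""
    state = 0xBAD  # arbitrary stale value standing in for pre-reset state
    for _ in range(reset_cycles):
        reset = True
        # With gating, the clock is suppressed while reset is asserted.
        clock_edge = not (gate_clock_on_reset and reset)
        if clock_edge and reset:
            state = 0  # a synchronous reset only acts on a clock edge
    return state


gated = run(gate_clock_on_reset=True)     # register keeps its stale value
ungated = run(gate_clock_on_reset=False)  # register is cleared

if __name__ == "__main__":
    print(f"gated: {gated:#x}, ungated: {ungated:#x}")
```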

It's not all perfect though: when I do that, it seems I can no longer dynamically reload a bitstream while the machine is booted :/ During a "dynamic" reload, the PCIe core then follows this sequence:

And I can't get it out of that state (even manually asserting pcie_rst_n on the core just makes it go through the same sequence). If I reboot the machine, it trains and works just fine though.

This might be because the BIOS programs something differently when the card is detected at boot than when it's not, which then prevents a dynamic reload.

smunaut commented 2 years ago

Closing, as I don't think the remaining weirdness is LitePCIe-related, and removing the clock gating on reset fixed most of the issues.