enjoy-digital / litepcie

Small footprint and configurable PCIe core

CrossClockDomain FIFO depth too shallow causes inefficiencies #104

Closed smunaut closed 1 year ago

smunaut commented 2 years ago

In PHYTXDatapath and PHYRXDatapath, the async FIFO used to cross between the sys and pcie clock domains is only 4 entries deep.

In my case, with a 200 MHz sys clock and a 250 MHz pcie clock, the TX FIFO fills up when the sys side pushes continuously, before the pcie side has had a chance to read anything. This causes back pressure in the sys domain even though the pcie domain is faster and always ready to accept data.

I increased the FIFO depth to 16 in both PHYTXDatapath and PHYRXDatapath, and that change alone raised my DMA speed from 37.45 Gbit/s to 45.5 Gbit/s (on a gen3 x8 link), which is pretty significant.
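
To illustrate the effect, here is a rough behavioural sketch in plain Python (not LitePCIe or Migen code). It models an async FIFO whose read and write pointers each become visible in the other clock domain only after a few destination clock edges. The `simulate` and `edges_between` names, the `sync_edges=3` latency, and the idealised always-ready reader are assumptions made for illustration, not values taken from the core, so the result shows the trend only and is not meant to reproduce the measured throughput.

```python
def edges_between(t_src, t_dst, period):
    """Count destination clock edges in the interval (t_src, t_dst]."""
    return t_dst // period - t_src // period


def simulate(depth, wr_period_ps=5000, rd_period_ps=4000,
             sync_edges=3, n_wr_cycles=10_000):
    """Return the fraction of write-clock cycles in which a push is accepted.

    The writer pushes on every edge unless the FIFO looks full, and the
    reader pops on every edge unless the FIFO looks empty. Each side only
    sees the other side's pointer after `sync_edges` of its own clock
    edges (an assumed synchronizer latency, not a measured one).
    """
    t_end = n_wr_cycles * wr_period_ps
    events = [(k * wr_period_ps, "w") for k in range(1, n_wr_cycles + 1)]
    events += [(t, "r") for t in range(rd_period_ps, t_end + 1, rd_period_ps)]
    events.sort()

    writes, reads = [], []   # timestamps of accepted pushes and pops
    wr_seen = rd_seen = 0    # pointer values as seen across the crossing
    for t, kind in events:
        if kind == "w":
            # Update the writer's (delayed) view of the read pointer.
            while (rd_seen < len(reads) and
                   edges_between(reads[rd_seen], t, wr_period_ps) >= sync_edges):
                rd_seen += 1
            # Push only if the FIFO does not look full from the write side.
            if len(writes) - rd_seen < depth:
                writes.append(t)
        else:
            # Update the reader's (delayed) view of the write pointer.
            while (wr_seen < len(writes) and
                   edges_between(writes[wr_seen], t, rd_period_ps) >= sync_edges):
                wr_seen += 1
            # Pop only if the FIFO does not look empty from the read side.
            if len(reads) < wr_seen:
                reads.append(t)
    return len(writes) / n_wr_cycles


if __name__ == "__main__":
    for depth in (4, 16):
        print(f"depth={depth}: accepted fraction = {simulate(depth):.3f}")
```

Under these assumptions the pointer round trip takes longer than four write cycles, so a 4-deep FIFO stalls the writer regularly, while a 16-deep FIFO covers the round trip and the writer never sees back pressure.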

enjoy-digital commented 1 year ago

Thanks @smunaut, FIFO depths have been increased to 16 with https://github.com/enjoy-digital/litepcie/commit/5a7682a5a9426551c643f53bd93a82f069b1bd2b.