enjoy-digital / litex

Build your hardware, easily!

High Wishbone bus read latency on the VexRiscv CPU "minimal" variant, and high read/write latency on other CPUs. #1863

Open mgaggero opened 10 months ago

mgaggero commented 10 months ago

Hello, I'm currently working with LiteX on a Muselab IceSugar board equipped with a Lattice iCE40UP5K FPGA. I'm seeing unexpectedly high latency when reading from the Wishbone bus. To measure it, I've implemented a 32-bit counter that increments every clock cycle and can be reset and read.

What I've observed is that two consecutive reads of the counter differ by approximately 400 clock cycles when using the VexRiscv CPU with the "minimal" variant. Interestingly, the "lite" variant doesn't exhibit this latency. With other CPUs such as PicoRV32 and FemtoRV, both read and write operations show a consistent latency of 270 cycles.
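For reference, a minimal sketch of how the difference between two counter samples can be computed so that a wraparound of the 32-bit counter between the two reads does not distort the result. The function name and the sample values are hypothetical and are not measurements from the board.

```python
COUNTER_MASK = 0xFFFFFFFF  # 32-bit counter width, as described above


def elapsed_cycles(first_read: int, second_read: int) -> int:
    """Cycles between two samples of a free-running 32-bit counter.

    The subtraction is taken modulo 2**32, so a single wraparound
    between the two samples still yields the correct difference.
    """
    return (second_read - first_read) & COUNTER_MASK


# Hypothetical sample values, chosen only to illustrate the arithmetic.
print(elapsed_cycles(1000, 1400))              # 400
print(elapsed_cycles(0xFFFFFF00, 0x00000090))  # 400 (counter wrapped once)
```

Note that the value obtained this way includes every cycle spent between the two reads, including instruction fetches, not only the bus transfer itself.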

This latency appears to be related to the Wishbone bus, since the VexRiscv "minimal" variant, PicoRV32, and FemtoRV all show similar behavior.

Dolu1990 commented 10 months ago

Hi,

What memory system are you using? Is the code executing in place (XIP) from flash?

mgaggero commented 10 months ago

Yes. As far as I understand, the code is executed directly from the Quad SPI flash; program data is not copied to SRAM. Note also that the VexRiscv "minimal" variant has no instruction or data cache.
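To illustrate why uncached XIP execution can dominate such a measurement, here is a rough back-of-envelope model. Every number in it is a placeholder assumption (SPI phase lengths, clock divider, instruction count, execute cost); none of them are taken from the LiteX SPI flash core, the flash datasheet, or the measurements above.

```python
def xip_fetch_cycles(cmd_clocks: int, addr_clocks: int,
                     dummy_clocks: int, data_clocks: int,
                     clk_div: int) -> int:
    """System-clock cycles for one uncached 32-bit fetch from SPI flash.

    Each fetch is modelled as a full flash transaction: command, address,
    dummy and data phases, scaled by the system-clock to SPI-clock ratio.
    """
    spi_clocks = cmd_clocks + addr_clocks + dummy_clocks + data_clocks
    return spi_clocks * clk_div


def read_to_read_cycles(instructions_between_reads: int,
                        fetch_cycles: int,
                        execute_cycles_per_instr: int) -> int:
    """Cycles between two counter reads when every instruction is fetched
    from flash with no cache in between."""
    return instructions_between_reads * (fetch_cycles + execute_cycles_per_instr)


# Placeholder values, for illustration only.
fetch = xip_fetch_cycles(cmd_clocks=8, addr_clocks=6,
                         dummy_clocks=8, data_clocks=8, clk_div=2)
print(fetch)                                                # 60
print(read_to_read_cycles(instructions_between_reads=5,
                          fetch_cycles=fetch,
                          execute_cycles_per_instr=4))      # 320
```

Under these assumptions, a handful of instructions between the two reads is already enough to account for several hundred cycles, which would mean the measured delta reflects instruction fetch time rather than Wishbone latency alone.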

Dolu1990 commented 10 months ago

I think the reason why PicoRV32 and FemtoRV are faster in that specific case is that VexRiscv is pipelined: it fetches instructions in advance, and those fetches sometimes turn out to be wrong (branch misprediction), meaning it wastes some fetch bandwidth,