SpinalHDL / VexRiscv

A FPGA friendly 32 bit RISC-V CPU implementation

VexRiscv DCache memory data width #134

Open lockkkk opened 4 years ago

lockkkk commented 4 years ago

Hi, I noticed that in the smp branch, VexRiscv supports a memory data width (e.g., 128 bits) that is larger than the CPU data width (32 bits). However, in that branch, cpu_dBus_cmd_payload_data is still 32 bits wide, which means the core can only write 32 bits to the on-chip RAM per cycle.

For example, I modified the master code according to commit b0f7f3. If I configure the memory data width to 512 bits, I get an error at https://github.com/SpinalHDL/VexRiscv/blob/b0f7f37ac8b3ca9221e1d176b8d1d893b3cbd9f3/src/main/scala/vexriscv/ip/DataCache.scala#L267, which tries to assign a 32-bit dBus_cmd to a 512-bit AXI wire.

Can we avoid this error? Does the current VexRiscv support a DBus with a larger data width in both TX and RX? If not, do you have any suggestions for implementing this?

Thanks a lot!

Dolu1990 commented 4 years ago

Hi,

> smp branch

I recently merged smp into dev. From now on, all work will continue in dev ^^

> the VexRiscv supports larger memory data width (e.g., 128 bits) which is bigger than the cpu data width (32 bits)

Right

> the cpu_dBus_cmd_payload_data's data width is still 32 bit, which means the core can only write the on-chip ram 32 bits every cycle.

Ahhh you mean the AXI bridges. Right, only the BMB bridge was updated :)

> which tries to assign a 32-bit dBus_cmd to a 512-bit axi wire. Can we avoid this error?

The reason is that the VexRiscv data cache is write-through, so the data written is never wider than 32 bits. I kept the interface as pure as possible to have more flexibility in how it is bridged to system buses.

To bridge it, you can just replicate the 32 bits across the whole width of the bus:

axi.writeData.data.subdivideIn(32 bits).foreach(_ := dataStage.data)

That should be fine then.
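
The one-liner above only covers the data lanes. On a wide AXI bus, the write strobe still has to tell the slave which lane is actually valid once the same word sits in every lane. Below is a minimal plain-Scala model of that idea, not SpinalHDL and not the code in DataCache.scala. It assumes the 4-bit CPU byte mask is shifted to the lane selected by the byte address; the names `ReplicateModel`, `replicate` and `wideStrobe` are hypothetical.

```scala
object ReplicateModel {
  // Copy a 32-bit word into every 32-bit lane of a busWidth-bit value.
  def replicate(word: Long, busWidth: Int): BigInt = {
    require(busWidth % 32 == 0, "bus width must be a multiple of 32")
    val w = BigInt(word & 0xFFFFFFFFL)
    (0 until busWidth / 32).foldLeft(BigInt(0)) { (acc, lane) =>
      acc | (w << (32 * lane))
    }
  }

  // Assumption: the 4-bit byte mask moves to the lane picked by the byte address,
  // so only the addressed 32-bit lane is marked valid.
  def wideStrobe(mask: Int, address: Long, busWidth: Int): BigInt = {
    require(busWidth % 32 == 0, "bus width must be a multiple of 32")
    val busBytes = busWidth / 8
    val lane = ((address % busBytes) / 4).toInt
    BigInt(mask & 0xF) << (4 * lane)
  }
}
```

For a 128-bit bus, `ReplicateModel.replicate(0xDEADBEEFL, 128)` yields the word repeated four times, and `ReplicateModel.wideStrobe(0xF, 0x24L, 128)` sets only the four strobe bits of lane 1.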

Does current VexRiscv support DBus with larger data width in both TX and RX? If not, do you have any suggestions about implementing this?

So, the above solution isn't the best for DRAM performance. Ideally, we would need some form of write buffering / aggregation.

That's what I have done for the BMB bridge.
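
To make "write buffering / aggregation" concrete, here is a small plain-Scala software model. It is only a sketch of the general idea and is not how the BMB bridge is implemented. It assumes full 32-bit writes, a single buffered line, and a flush whenever a write targets a different bus-width-aligned line; `WriteAggregationModel`, `Aggregator` and `WideWrite` are hypothetical names.

```scala
object WriteAggregationModel {
  // One wide write: line-aligned byte address, data, and one strobe bit per byte.
  final case class WideWrite(lineAddress: Long, data: BigInt, strobe: BigInt)

  final class Aggregator(busWidth: Int) {
    require(busWidth % 32 == 0, "bus width must be a multiple of 32")
    private val busBytes = busWidth / 8
    private var line: Option[Long] = None
    private var data = BigInt(0)
    private var strobe = BigInt(0)

    // Accept one full 32-bit write. Returns the previously buffered line
    // if this write targets a different line and therefore forced a flush.
    def write(address: Long, word: Long): Option[WideWrite] = {
      val target = address - (address % busBytes)
      val flushed = if (line.exists(_ != target)) flush() else None
      val lane = ((address % busBytes) / 4).toInt
      val laneMask = BigInt(0xFFFFFFFFL) << (32 * lane)
      // Overwrite the addressed lane and mark its four bytes as valid.
      data = (data &~ laneMask) | (BigInt(word & 0xFFFFFFFFL) << (32 * lane))
      strobe = strobe | (BigInt(0xF) << (4 * lane))
      line = Some(target)
      flushed
    }

    // Emit whatever is buffered (if anything) and clear the buffer.
    def flush(): Option[WideWrite] = {
      val out = line.map(a => WideWrite(a, data, strobe))
      line = None
      data = BigInt(0)
      strobe = BigInt(0)
      out
    }
  }
}
```

With a 128-bit bus, two writes to 0x100 and 0x104 stay in the buffer, and a third write to 0x200 emits a single wide write for line 0x100 carrying both words. A hardware version would presumably also need to flush on reads to the buffered line, on fences, and after a timeout; those cases are left out of this model.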