kuznia-rdzeni / coreblocks

RISC-V out-of-order core for education and research purposes
https://kuznia-rdzeni.github.io/coreblocks/
BSD 3-Clause "New" or "Revised" License

Add pipelining support to LSU requester #695

Status: Closed (lekcyjna123 closed this 1 month ago)

lekcyjna123 commented 1 month ago

Here is a small refactor of the LSURequester. It now supports request pipelining thanks to the use of a FIFO. Additionally, the unit tests had to be updated, because after this change DummyLSU can reorder misaligned instructions ahead of the correct ones.

Based on #696

github-actions[bot] commented 1 month ago

Benchmarks summary

Performance benchmarks

| aha-mont64 | crc32 | minver | nettle-sha256 | nsichneu | slre | statemate | ud |
|---|---|---|---|---|---|---|---|
| 0.407 (0.000) | 0.527 (0.000) | 0.321 (0.000) | 0.652 (0.000) | 0.345 (0.000) | 0.283 (0.000) | 0.317 (0.000) | 0.405 (0.000) |

You can view all the metrics here.

Synthesis benchmarks (basic)

| Device utilisation: (ECP5) | LUTs used as DFF: (ECP5) | LUTs used as carry: (ECP5) | LUTs used as ram: (ECP5) | Max clock frequency (Fmax) |
|---|---|---|---|---|
| ▼ 21866 (-601) | ▲ 5569 (+8) | ▼ 770 (-32) | ▲ 1012 (+8) | ▼ 48 (-1) |

Synthesis benchmarks (full)

| Device utilisation: (ECP5) | LUTs used as DFF: (ECP5) | LUTs used as carry: (ECP5) | LUTs used as ram: (ECP5) | Max clock frequency (Fmax) |
|---|---|---|---|---|
| ▼ 33465 (-213) | ▲ 8811 (+8) | 1932 (0) | ▲ 1192 (+8) | ▲ 42 (+2) |

github-actions[bot] commented 1 month ago

Benchmarks summary

Performance benchmarks

| aha-mont64 | crc32 | minver | nettle-sha256 | nsichneu | slre | statemate | ud |
|---|---|---|---|---|---|---|---|
| 0.407 (0.000) | 0.527 (0.000) | 0.321 (0.000) | 0.652 (0.000) | 0.345 (0.000) | 0.283 (0.000) | 0.317 (0.000) | 0.405 (0.000) |

You can view all the metrics here.

Synthesis benchmarks (basic)

| Device utilisation: (ECP5) | LUTs used as DFF: (ECP5) | LUTs used as carry: (ECP5) | LUTs used as ram: (ECP5) | Max clock frequency (Fmax) |
|---|---|---|---|---|
| ▲ 23085 (+163) | ▲ 5569 (+8) | ▲ 802 (+32) | ▲ 1012 (+8) | ▼ 46 (-4) |

Synthesis benchmarks (full)

| Device utilisation: (ECP5) | LUTs used as DFF: (ECP5) | LUTs used as carry: (ECP5) | LUTs used as ram: (ECP5) | Max clock frequency (Fmax) |
|---|---|---|---|---|
| ▼ 32440 (-1665) | ▲ 8811 (+8) | ▼ 1932 (-32) | ▲ 1192 (+8) | ▼ 41 (-1) |

github-actions[bot] commented 1 month ago

Benchmarks summary

Performance benchmarks

| aha-mont64 | crc32 | minver | nettle-sha256 | nsichneu | slre | statemate | ud |
|---|---|---|---|---|---|---|---|
| 0.407 (0.000) | 0.527 (0.000) | 0.321 (0.000) | 0.652 (0.000) | 0.345 (0.000) | 0.283 (0.000) | 0.317 (0.000) | 0.405 (0.000) |

You can view all the metrics here.

Synthesis benchmarks (basic)

| Device utilisation: (ECP5) | LUTs used as DFF: (ECP5) | LUTs used as carry: (ECP5) | LUTs used as ram: (ECP5) | Max clock frequency (Fmax) |
|---|---|---|---|---|
| ▲ 24084 (+1162) | ▲ 5569 (+8) | ▲ 802 (+32) | ▲ 1012 (+8) | ▼ 49 (-1) |

Synthesis benchmarks (full)

| Device utilisation: (ECP5) | LUTs used as DFF: (ECP5) | LUTs used as carry: (ECP5) | LUTs used as ram: (ECP5) | Max clock frequency (Fmax) |
|---|---|---|---|---|
| ▼ 30477 (-3628) | ▲ 8811 (+8) | 1964 (0) | ▲ 1192 (+8) | ▼ 41 (-1) |

tilk commented 1 month ago

No change in benchmarks, as Wishbone Classic doesn't support pipelining.

lekcyjna123 commented 1 month ago

> No change in benchmarks, as Wishbone Classic doesn't support pipelining.

Yes, I expected that, but I ran the benchmarks to make sure that there is no regression.
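
For readers following the thread, here is a toy illustration of why a pipelined requester gives no speedup on a bus that accepts only one transaction at a time. This is a plain-Python sketch, not the coreblocks LSURequester or any Transactron API. The function `cycles_to_complete` and its parameters are hypothetical, and Wishbone Classic is modeled here simply as `max_outstanding=1`, which is an assumption made for the example.

```python
from collections import deque


def cycles_to_complete(num_requests: int, latency: int, max_outstanding: int) -> int:
    """Count cycles until all responses have arrived in a toy bus model.

    Assumptions of this sketch: every request takes `latency` cycles to
    answer, at most one request is issued per cycle, and at most
    `max_outstanding` requests may be in flight at once.
    """
    in_flight = deque()  # FIFO of completion cycles, oldest request first
    issued = 0
    completed = 0
    cycle = 0
    while completed < num_requests:
        # Retire the oldest request if its response has arrived.
        if in_flight and in_flight[0] <= cycle:
            in_flight.popleft()
            completed += 1
        # Issue a new request only if the bus accepts another one.
        if issued < num_requests and len(in_flight) < max_outstanding:
            in_flight.append(cycle + latency)
            issued += 1
        cycle += 1
    return cycle


# One outstanding request (the assumed Wishbone Classic behaviour):
classic = cycles_to_complete(num_requests=8, latency=3, max_outstanding=1)
# A hypothetical bus that accepts a new request every cycle:
pipelined = cycles_to_complete(num_requests=8, latency=3, max_outstanding=3)
print(classic, pipelined)
```

In this model the FIFO only helps when `max_outstanding` is greater than 1. With a single outstanding request, the requester waits for each response regardless of how many requests it could queue, which is consistent with the unchanged performance numbers above.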