chipsalliance / dromajo

RISC-V RV64GC emulator designed for RTL co-simulation
Apache License 2.0
210 stars, 63 forks

Optimize ctz #45

Closed by jserv 2 years ago

jserv commented 3 years ago

Use the ctz intrinsic for bit counting. When the intrinsic is not available, fall back to a branchless implementation.
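For readers following along, here is a rough sketch of what an intrinsic with a branchless fallback could look like. It is not the code from this PR: the names ctz32_sketch and ctz32_branchless are made up for illustration, and returning 32 for a zero input is an assumed convention that the actual change may not share.

```c
#include <stdint.h>

/* Hypothetical branchless fallback: isolate the lowest set bit, then
 * subtract from 32 according to which mask the isolated bit falls in.
 * Returns 32 for x == 0 (assumed convention). */
static inline int ctz32_branchless(uint32_t x)
{
    uint32_t low = x & (0u - x);            /* lowest set bit, or 0 */
    int n = 32;
    n -= (low != 0);
    n -= ((low & 0x0000FFFFu) != 0) << 4;
    n -= ((low & 0x00FF00FFu) != 0) << 3;
    n -= ((low & 0x0F0F0F0Fu) != 0) << 2;
    n -= ((low & 0x33333333u) != 0) << 1;
    n -= ((low & 0x55555555u) != 0);
    return n;
}

/* Hypothetical wrapper: use the compiler builtin when it exists. */
static inline int ctz32_sketch(uint32_t x)
{
#if defined(__GNUC__) || defined(__clang__)
    /* __builtin_ctz is undefined for 0, so that case is handled here. */
    return x ? __builtin_ctz(x) : 32;
#else
    return ctz32_branchless(x);
#endif
}
```

The comparisons in the fallback evaluate to 0 or 1, so the function contains no explicit branches, though what the compiler emits still depends on the target.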

et-tommythorn commented 3 years ago

Thanks. Could you add a unit test under an #ifdef to verify that it implements the same behavior as the original? Running through all 32-bit numbers is totally acceptable and takes at most a few minutes on a modern processor.
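A test of the kind requested might look roughly like the sketch below. It makes several assumptions: ctz32_ref is a plain loop standing in for the original implementation (which is not shown in this thread), ctz32_new stands in for the optimized version, CTZ_SELFTEST is a made-up macro name, and both functions are assumed to return 32 for zero.

```c
#include <stdint.h>
#include <stdio.h>

/* Stand-in for the original implementation: a simple shift loop. */
static int ctz32_ref(uint32_t x)
{
    int n = 0;
    if (x == 0)
        return 32;                  /* assumed convention for zero */
    while ((x & 1u) == 0) {
        x >>= 1;
        n++;
    }
    return n;
}

/* Stand-in for the optimized version under test. */
static int ctz32_new(uint32_t x)
{
#if defined(__GNUC__) || defined(__clang__)
    return x ? __builtin_ctz(x) : 32;
#else
    uint32_t low = x & (0u - x);    /* lowest set bit, or 0 */
    int n = 32;
    n -= (low != 0);
    n -= ((low & 0x0000FFFFu) != 0) << 4;
    n -= ((low & 0x00FF00FFu) != 0) << 3;
    n -= ((low & 0x0F0F0F0Fu) != 0) << 2;
    n -= ((low & 0x33333333u) != 0) << 1;
    n -= ((low & 0x55555555u) != 0);
    return n;
#endif
}

/* Compare both versions over [first, last] inclusive and return the
 * number of inputs on which they disagree. */
static uint64_t ctz32_check_range(uint32_t first, uint32_t last)
{
    uint64_t mismatches = 0;
    uint32_t x = first;
    for (;;) {
        if (ctz32_ref(x) != ctz32_new(x))
            mismatches++;
        if (x == last)              /* test before incrementing to avoid wraparound */
            break;
        x++;
    }
    return mismatches;
}

#ifdef CTZ_SELFTEST                 /* hypothetical macro name */
int main(void)
{
    uint64_t bad = ctz32_check_range(0u, UINT32_MAX);
    printf("%llu mismatches\n", (unsigned long long)bad);
    return bad != 0;
}
#endif
```

Building with -DCTZ_SELFTEST would run the exhaustive comparison over all 32-bit values; without the macro, only the helper functions are compiled.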

jserv commented 3 years ago

> Thanks. Could you add a unit test under an #ifdef to verify that it implements the same behavior as the original? Running through all 32-bit numbers is totally acceptable and takes at most a few minutes on a modern processor.

In which directory/file should I put the clz-specific unit test?

et-tommythorn commented 3 years ago

We don't have an established practice, but perhaps create a top-level tests directory, a sibling of src and include? Or, if more convenient, it could be a subdirectory of src, but I suspect that might create its own issues.

et-tommythorn commented 2 years ago

Thanks for your change. Sorry for the long delay. Looking it over again, it seems fine and I'll pull it.