Open Yawning opened 3 years ago
Hmm, this is a hard one, thank you for trying it out. I value the readability of those functions a lot, for both maintainability and education purposes. I'll make a PR, but probably won't merge it and will instead use it to push golang/go#29571. Hopefully by Go 1.18 it won't be a problem anymore.
> this only impacts non-amd64/arm64 (due to dedicated assembly)
Note that the arm64 assembly is just a tiny `carryPropagate` core, not the full `Square` and `Multiply`.
I'm not sure how bad the compiler behavior is on non-amd64 (due to lack of access to targets), and this only impacts non-amd64/arm64 (due to dedicated assembly), but https://github.com/golang/go/issues/29571 is (also) costing you a good amount of performance.
Unfortunately, having giant walls of `math/bits` calls is less readable than the wrapper type, so depending on how you want to balance "readability" vs "going fast" this might be ok.

nb: Only did one iteration on an amd64 target with go 1.17beta1 + purego, so there's some noise in the comparison, but the difference is statistically significant and noticeable.
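For readers who haven't seen the two styles side by side, here is a minimal sketch of the trade-off being discussed. The names `uint128`, `mul64`, `addMul64`, `dotWrapped`, and `dotInline` are illustrative assumptions and not necessarily the identifiers used in the repository. The sketch only shows the shape of the code, not the actual field arithmetic or the benchmarked change.

```go
package main

import (
	"fmt"
	"math/bits"
)

// uint128 is a hypothetical wrapper holding a 128-bit accumulator.
type uint128 struct {
	lo, hi uint64
}

// mul64 returns the full 128-bit product a * b.
func mul64(a, b uint64) uint128 {
	hi, lo := bits.Mul64(a, b)
	return uint128{lo: lo, hi: hi}
}

// addMul64 returns v + a*b, discarding any carry out of 128 bits.
func addMul64(v uint128, a, b uint64) uint128 {
	hi, lo := bits.Mul64(a, b)
	lo, c := bits.Add64(lo, v.lo, 0)
	hi, _ = bits.Add64(hi, v.hi, c)
	return uint128{lo: lo, hi: hi}
}

// dotWrapped computes a0*b0 + a1*b1 using the wrapper type.
// This is the readable style: one short line per term.
func dotWrapped(a0, b0, a1, b1 uint64) (lo, hi uint64) {
	r := mul64(a0, b0)
	r = addMul64(r, a1, b1)
	return r.lo, r.hi
}

// dotInline computes the same value with direct math/bits calls,
// keeping lo and hi as separate scalars instead of a struct.
func dotInline(a0, b0, a1, b1 uint64) (lo, hi uint64) {
	hi, lo = bits.Mul64(a0, b0)
	h1, l1 := bits.Mul64(a1, b1)
	var c uint64
	lo, c = bits.Add64(lo, l1, 0)
	hi, _ = bits.Add64(hi, h1, c)
	return lo, hi
}

func main() {
	const limb = 1<<51 - 1 // a 51-bit value, chosen only as sample input
	wl, wh := dotWrapped(limb, limb, limb, 19)
	il, ih := dotInline(limb, limb, limb, 19)
	fmt.Println(wl == il && wh == ih)
}
```

Both functions compute the same result. With only two terms the inline version is still manageable, but a full multiplication has many more terms, which is where the "giant walls" of calls come from.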