finalfusion / finalfrontier

Context-sensitive word embeddings with subwords. In Rust.
https://finalfusion.github.io/finalfrontier

Add dot product implementation using fused multiply-add (FMA) #113

Closed danieldk closed 4 years ago

danieldk commented 4 years ago

This adds a variant that uses the FMA intrinsic when the machine supports it. On my machine it's neither slower nor faster, but my modest i5 might be constrained by cache size and memory speed at this point ;).
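For readers unfamiliar with the approach, the following is a minimal sketch of a dot product that uses the FMA intrinsic only when the CPU reports support for it, and otherwise falls back to a plain loop. It is not the code from this PR: the function names `dot`, `dot_fma`, and `dot_unvectorized` are hypothetical, and the sketch assumes `f32` slices of equal length and an AVX + FMA code path on x86_64.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;

/// Fallback (hypothetical name): separate multiply and add,
/// so each step rounds twice.
pub fn dot_unvectorized(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len());
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// FMA variant (hypothetical name): 8 lanes per iteration,
/// each multiply-add is fused and rounds once.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx,fma")]
unsafe fn dot_fma(a: &[f32], b: &[f32]) -> f32 {
    let mut acc = _mm256_setzero_ps();
    let chunks = a.len() / 8;
    for i in 0..chunks {
        // Unaligned loads of 8 consecutive f32 values from each slice.
        let x = _mm256_loadu_ps(a.as_ptr().add(i * 8));
        let y = _mm256_loadu_ps(b.as_ptr().add(i * 8));
        // acc = x * y + acc, fused.
        acc = _mm256_fmadd_ps(x, y, acc);
    }

    // Horizontal sum of the 8 lanes.
    let mut lanes = [0f32; 8];
    _mm256_storeu_ps(lanes.as_mut_ptr(), acc);
    let mut sum: f32 = lanes.iter().sum();

    // Remaining elements that do not fill a full 8-lane chunk.
    for i in chunks * 8..a.len() {
        sum = a[i].mul_add(b[i], sum);
    }
    sum
}

/// Picks the FMA path at runtime when the machine supports it.
pub fn dot(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len());
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx") && is_x86_feature_detected!("fma") {
            // Safe to call: the required CPU features were just detected.
            return unsafe { dot_fma(a, b) };
        }
    }
    dot_unvectorized(a, b)
}
```

Because the summation order differs between the two paths, results can differ in the last bits for general inputs; they agree exactly only when every intermediate value is representable in `f32`.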

twuebi commented 4 years ago

Timings on tdz-train:

With FMA: 12.8s, 13.7s, 12.8s
Without FMA: 12.65s, 12.86s, 13.2s

danieldk commented 4 years ago

Thanks! I'll merge this, since it doesn't reduce performance and increases precision. I'll see if I can make some changes to improve pipeline use.
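One common way to improve pipeline use in a reduction like this is to keep several independent accumulators, so that consecutive fused multiply-adds do not each wait on the previous result. The sketch below illustrates that idea only; it is an assumption about what such a change could look like, not the change that was later made. The name `dot_multi_acc` and the choice of four accumulators are hypothetical, and it uses the portable `f32::mul_add` rather than intrinsics.

```rust
/// Dot product with four independent accumulators (hypothetical sketch).
/// Each accumulator forms its own dependency chain, which gives the CPU
/// more independent multiply-adds to overlap.
pub fn dot_multi_acc(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len());

    let mut acc = [0f32; 4];
    let mut chunks_a = a.chunks_exact(4);
    let mut chunks_b = b.chunks_exact(4);

    for (x, y) in chunks_a.by_ref().zip(chunks_b.by_ref()) {
        for lane in 0..4 {
            // Fused multiply-add: x * y + acc with a single rounding.
            acc[lane] = x[lane].mul_add(y[lane], acc[lane]);
        }
    }

    // Combine the partial sums, then handle the leftover elements.
    let mut sum = (acc[0] + acc[1]) + (acc[2] + acc[3]);
    for (x, y) in chunks_a.remainder().iter().zip(chunks_b.remainder()) {
        sum = x.mul_add(*y, sum);
    }
    sum
}
```

Whether this helps in practice depends on the machine; as noted earlier in the thread, the computation may already be limited by cache size and memory speed rather than by arithmetic throughput.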