finalfusion / finalfrontier

Context-sensitive word embeddings with subwords. In Rust.
https://finalfusion.github.io/finalfrontier

Add dot product benchmarks, improve performance #155

Closed danieldk closed 3 years ago

danieldk commented 3 years ago

Benchmarks below:

dot_avx                 time:   [38.051 ns 38.076 ns 38.108 ns]                     
                        change: [-2.9680% -2.8665% -2.7561%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  2 (2.00%) low mild
  1 (1.00%) high mild
  3 (3.00%) high severe

dot_fma                 time:   [34.151 ns 34.166 ns 34.179 ns]                     
                        change: [-38.371% -38.312% -38.257%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) low mild
  2 (2.00%) high mild
  1 (1.00%) high severe

dot_sse                 time:   [80.847 ns 80.862 ns 80.882 ns]                    
                        change: [-4.5626% -4.5090% -4.4474%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 16 outliers among 100 measurements (16.00%)
  1 (1.00%) low severe
  2 (2.00%) low mild
  7 (7.00%) high mild
  6 (6.00%) high severe

dot_unvectorized        time:   [338.73 ns 338.79 ns 338.84 ns]                             
                        change: [+0.7094% +0.7583% +0.8104%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 8 outliers among 100 measurements (8.00%)
  1 (1.00%) low severe
  1 (1.00%) low mild
  4 (4.00%) high mild
  2 (2.00%) high severe
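
For readers without the diff at hand, here is a minimal sketch of what a scalar and an SSE dot product could look like. The function names mirror the benchmark labels above, but the bodies are illustrative assumptions and not finalfrontier's actual implementation.

```rust
/// Scalar baseline: multiply element-wise and accumulate.
/// (Hypothetical body; only the name is taken from the benchmark labels.)
pub fn dot_unvectorized(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len());
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// SSE sketch: process four f32 lanes per iteration, then add the tail.
/// (Hypothetical body; only the name is taken from the benchmark labels.)
#[cfg(target_arch = "x86_64")]
pub fn dot_sse(a: &[f32], b: &[f32]) -> f32 {
    use std::arch::x86_64::*;

    assert_eq!(a.len(), b.len());
    let chunks = a.len() / 4;
    let mut lanes = [0f32; 4];

    // SAFETY: SSE is part of the x86_64 baseline, and every unaligned load
    // reads 4 floats starting at index `i * 4` with `i < chunks`, so all
    // reads stay inside both slices.
    unsafe {
        let mut acc = _mm_setzero_ps();
        for i in 0..chunks {
            let va = _mm_loadu_ps(a.as_ptr().add(i * 4));
            let vb = _mm_loadu_ps(b.as_ptr().add(i * 4));
            acc = _mm_add_ps(acc, _mm_mul_ps(va, vb));
        }
        _mm_storeu_ps(lanes.as_mut_ptr(), acc);
    }

    // Elements that do not fill a whole 4-lane chunk are handled by the scalar path.
    let tail = dot_unvectorized(&a[chunks * 4..], &b[chunks * 4..]);
    lanes.iter().sum::<f32>() + tail
}
```

The two functions should agree up to floating-point rounding; the SSE variant sums in a different order, so bit-identical results are only expected for inputs whose partial sums are exactly representable.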

danieldk commented 3 years ago

Clippy emits a lot of

unsafe function's docs miss `# Safety` section

warnings now that the raw vectorized dot products are public (for benchmarking). I think many of these functions are provably safe. I'll update the PR tonight.
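
The warning appears to come from Clippy's `missing_safety_doc` lint, which checks publicly visible `unsafe` functions. A minimal sketch of the two usual remedies follows: documenting the contract in a `# Safety` section, and exposing a safe wrapper that establishes the contract itself. The names `dot_fma`, `dot_unvectorized`, and `dot` and their bodies are assumptions for illustration, not the code from this PR.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;

/// Scalar fallback (hypothetical body).
pub fn dot_unvectorized(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len());
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Dot product using AVX loads and fused multiply-add (hypothetical body).
///
/// # Safety
///
/// The caller must ensure that the CPU supports the `avx` and `fma`
/// features, for example by checking `is_x86_feature_detected!`.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx,fma")]
pub unsafe fn dot_fma(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len());
    let chunks = a.len() / 8;
    let mut lanes = [0f32; 8];

    // SAFETY: the caller guarantees AVX and FMA support, and each load reads
    // 8 floats starting at `i * 8` with `i < chunks`, which is in bounds.
    unsafe {
        let mut acc = _mm256_setzero_ps();
        for i in 0..chunks {
            let va = _mm256_loadu_ps(a.as_ptr().add(i * 8));
            let vb = _mm256_loadu_ps(b.as_ptr().add(i * 8));
            acc = _mm256_fmadd_ps(va, vb, acc); // acc += va * vb
        }
        _mm256_storeu_ps(lanes.as_mut_ptr(), acc);
    }

    let tail = dot_unvectorized(&a[chunks * 8..], &b[chunks * 8..]);
    lanes.iter().sum::<f32>() + tail
}

/// Safe entry point: checks CPU features at runtime, so callers need no `unsafe`.
pub fn dot(a: &[f32], b: &[f32]) -> f32 {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx") && is_x86_feature_detected!("fma") {
            // SAFETY: both required features were just detected.
            return unsafe { dot_fma(a, b) };
        }
    }
    dot_unvectorized(a, b)
}
```

With this split, only `dot_fma` carries a safety contract, and the contract is narrow enough (CPU feature support) that a wrapper like `dot` can discharge it once for all callers.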

sebpuetz commented 3 years ago

I'll take a look later today, too! I gave it a quick glance earlier, and on my machine there are also nice improvements :)

There was also a superfluous unsafe block in one of the benches, I think it was the unvectorized one!
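
As a small illustration of the superfluous block mentioned above: wrapping a call to a safe function in `unsafe` changes nothing at runtime, and rustc's `unused_unsafe` lint reports it as an unnecessary `unsafe` block. The function names and bodies below are assumptions, not the benchmark code from this PR.

```rust
/// Scalar dot product; a safe function (hypothetical body).
pub fn dot_unvectorized(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len());
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Calls the safe function through a needless `unsafe` block.
pub fn with_superfluous_unsafe(a: &[f32], b: &[f32]) -> f32 {
    // Without this `allow`, rustc would warn that the `unsafe` block is
    // unnecessary, because nothing inside it requires `unsafe`.
    #[allow(unused_unsafe)]
    let result = unsafe { dot_unvectorized(a, b) };
    result
}

/// The equivalent call without the block.
pub fn without_unsafe(a: &[f32], b: &[f32]) -> f32 {
    dot_unvectorized(a, b)
}
```

Both variants compute the same value, so dropping the block is purely a cleanup.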