Up to 10x faster strings for C, C++, Python, Rust, and Swift, leveraging NEON, AVX2, AVX-512, and SWAR to accelerate search, sort, edit distances, alignment scores, etc 🦖
So why would anyone replace the easy-to-use PyBind11 with almost 2,000 lines of pure CPython bindings?! Of course, to lower the latency! PyBind11 wraps every C++ object with a smart pointer, puts a hash table next to it, and addresses function pointers with std::string key lookups 🤯
Let's see where it gets us when benchmarking on the "Leipzig1M" dataset. The bandwidth-oriented functions are just as fast as in the past:
Hashing the dataset: 77 ms vs 16 ms. 4.8x faster
Counting the number of "the": 151 ms vs 45 ms. 3.3x faster
Split all whitespace-delimited words: 782 ms vs 338 ms. 2.3x faster
Split around every "the": 240 ms vs 48 ms. 5x faster
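For anyone who wants to reproduce measurements of this kind, here is a minimal sketch of a bandwidth-style timing loop. It is not the benchmark script behind the numbers above: it uses Python's built-in `str` and a small synthetic text as stand-ins for the library's string class and the "Leipzig1M" file, and the `time_once` helper is a hypothetical name introduced only for illustration.

```python
import time

def time_once(fn):
    """Run fn() a single time and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start

# Assumption: synthetic stand-in text, not the real "Leipzig1M" dataset.
text = "the quick brown fox jumps over the lazy dog\n" * 100_000

# The four bandwidth-oriented workloads from the list above,
# expressed with built-in `str` methods.
workloads = {
    "hash": lambda: hash(text),
    "count 'the'": lambda: text.count("the"),
    "split on whitespace": lambda: text.split(),
    "split around 'the'": lambda: text.split("the"),
}

results = {}
for name, fn in workloads.items():
    results[name], seconds = time_once(fn)
    print(f"{name}: {seconds * 1e3:.2f} ms")
```

Each workload touches the whole input once, so the time is dominated by how fast bytes are scanned rather than by per-call overhead. Note that CPython caches the hash of a `str` object, so only the first `hash(text)` call is meaningful.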
What about the latency-oriented ones?
Find the first whitespace: 1 ns vs 3 ns. 3x slower
Partition around the first whitespace: 73 µs vs 33 ns. 2212x faster
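Latency-oriented calls finish in nanoseconds, so a single timed call is mostly noise. A rough sketch of how such calls could be measured with the standard `timeit` module follows. It again uses the built-in `str` as a stand-in, and `ns_per_call` is a hypothetical helper, not part of any library.

```python
import timeit

# A short input: the work per call is tiny, so call overhead dominates.
line = "the quick brown fox jumps over the lazy dog"

def ns_per_call(fn, number=200_000, repeat=5):
    """Best-of-`repeat` average time of one fn() call, in nanoseconds."""
    best = min(timeit.repeat(fn, number=number, repeat=repeat))
    return best / number * 1e9

# The two latency-oriented operations from the list above.
first_space = line.find(" ")
head, sep, tail = line.partition(" ")

print(f"find:      {ns_per_call(lambda: line.find(' ')):.0f} ns")
print(f"partition: {ns_per_call(lambda: line.partition(' ')):.0f} ns")
```

The lambda wrapper adds its own call cost to every measurement, so the printed figures overstate the string operation itself. At this scale that is exactly the kind of binding overhead the comparison above is about.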