ashvardanian / StringZilla

Up to 10x faster strings for C, C++, Python, Rust, and Swift, leveraging NEON, AVX2, AVX-512, and SWAR to accelerate search, sort, edit distances, alignment scores, etc 🦖
https://ashvardanian.com/posts/stringzilla/
Apache License 2.0
2.05k stars 66 forks

Fix: correct and minimize some utility functions. #62

kmapb closed 8 months ago

kmapb commented 8 months ago

Refactor sz_size_log2i to make it testable, test it, and fix the bugs uncovered. Miscalculations here were why large strings occasionally segfaulted.
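For readers unfamiliar with this kind of helper, here is a minimal sketch of a testable integer log2 and the boundary cases worth checking. The names `size_log2i_sketch` and `size_log2i_ceil_sketch` are hypothetical and this is not the library's implementation of sz_size_log2i; it only illustrates why values just below, at, and just above a power of two deserve explicit tests.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not StringZilla's code: floor(log2(n)) for n > 0.
   Returns 0 for n == 0 by convention, so the result is always defined. */
static size_t size_log2i_sketch(size_t n) {
    size_t log = 0;
    while (n >>= 1) ++log; /* count how many times n can be halved */
    return log;
}

/* Hypothetical helper: smallest exponent e such that (1 << e) >= n.
   Useful when sizing a power-of-two buffer that must hold n items. */
static size_t size_log2i_ceil_sketch(size_t n) {
    if (n <= 1) return 0;
    return size_log2i_sketch(n - 1) + 1;
}

/* Boundary checks around powers of two, where off-by-one bugs tend to hide. */
static void check_size_log2i_sketch(void) {
    assert(size_log2i_sketch(1) == 0);
    assert(size_log2i_sketch(2) == 1);
    assert(size_log2i_sketch(3) == 1);
    assert(size_log2i_sketch(4) == 2);
    assert(size_log2i_sketch(1023) == 9);
    assert(size_log2i_sketch(1024) == 10);
    assert(size_log2i_ceil_sketch(1024) == 10);
    assert(size_log2i_ceil_sketch(1025) == 11);
}
```

An underestimate by one in a helper like this can halve a buffer that was sized from its result, which would be consistent with crashes that only appear on large inputs.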

The larger unit tests still don't pass, but they now fail semantically rather than crashing: a copied string does not have distance 0 from its copy source. In the interest of keeping the diff manageable, I consider this PR worth submitting as is, and we can continue debugging edit distance later.
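The failing property can be stated as a small check against a reference implementation. The sketch below is an assumption-laden illustration, not the project's test suite: `levenshtein_reference` and `copy_has_zero_distance` are hypothetical names, and the distance is computed with a plain two-row dynamic program rather than any of the library's accelerated kernels.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reference: Levenshtein distance with a two-row DP table.
   Returns (size_t)-1 if memory cannot be allocated. */
static size_t levenshtein_reference(const char *a, size_t a_len,
                                    const char *b, size_t b_len) {
    size_t *prev = malloc((b_len + 1) * sizeof(size_t));
    size_t *curr = malloc((b_len + 1) * sizeof(size_t));
    if (!prev || !curr) { free(prev); free(curr); return (size_t)-1; }

    for (size_t j = 0; j <= b_len; ++j) prev[j] = j; /* empty prefix of a */
    for (size_t i = 1; i <= a_len; ++i) {
        curr[0] = i; /* empty prefix of b */
        for (size_t j = 1; j <= b_len; ++j) {
            size_t cost = a[i - 1] == b[j - 1] ? 0 : 1;
            size_t best = prev[j - 1] + cost;                  /* substitute */
            if (prev[j] + 1 < best) best = prev[j] + 1;         /* delete */
            if (curr[j - 1] + 1 < best) best = curr[j - 1] + 1; /* insert */
            curr[j] = best;
        }
        size_t *tmp = prev; prev = curr; curr = tmp; /* reuse the two rows */
    }
    size_t result = prev[b_len];
    free(prev);
    free(curr);
    return result;
}

/* Property under test: a string and its byte-for-byte copy are at distance 0.
   Returns 1 when the property holds, 0 otherwise (including on allocation failure). */
static int copy_has_zero_distance(size_t length) {
    char *source = malloc(length + 1);
    char *copy = malloc(length + 1);
    if (!source || !copy) { free(source); free(copy); return 0; }

    for (size_t i = 0; i < length; ++i) source[i] = (char)('a' + i % 26);
    memcpy(copy, source, length);

    int ok = levenshtein_reference(source, length, copy, length) == 0;
    free(source);
    free(copy);
    return ok;
}

/* A few lengths, including zero and a non-trivial size. */
static void check_copy_distance(void) {
    assert(copy_has_zero_distance(0));
    assert(copy_has_zero_distance(1));
    assert(copy_has_zero_distance(1000));
}
```

Comparing an optimized distance function against a slow reference like this on the same inputs is one way to narrow down which string lengths first break the property.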