Closed kloetzl closed 5 years ago
Thanks. What happens if a character with byte value 64 is included... do you expect your approach to work?
Unless I am missing something, that will work just fine. Consider the string "@b c", aka 0x40622063. After the shuffle that becomes 0x20002000. Comparing that to the original string gives 0x0000FF00.
It is basically a simplification over what you use in simdjson that works as long as all characters of interest have differing low nibbles.
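For readers following along, here is a minimal scalar sketch of the "@b c" example. It is not the code from the pull request: `nibble_table`, `shuffle_byte` and `whitespace_mask` are hypothetical names, the table assumes the characters of interest are space, `\n` and `\r`, and `shuffle_byte` only imitates what a byte shuffle such as PSHUFB does to a single lane.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 16-entry table indexed by a byte's low nibble.
   Assumption: the characters of interest are ' ' (0x20), '\n' (0x0A)
   and '\r' (0x0D), whose low nibbles 0x0, 0xA and 0xD all differ. */
static const uint8_t nibble_table[16] = {
    0x20, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0x0A, 0, 0, 0x0D, 0, 0
};

/* Scalar stand-in for one lane of a PSHUFB-style shuffle: a set high bit
   yields 0, otherwise the low nibble selects a table entry. */
static uint8_t shuffle_byte(uint8_t b) {
    return (b & 0x80) ? 0 : nibble_table[b & 0x0F];
}

/* Writes 0xFF to mask[i] where in[i] equals its own table entry
   (i.e. is one of the characters of interest), 0x00 elsewhere. */
static void whitespace_mask(const uint8_t *in, uint8_t *mask, size_t n) {
    for (size_t i = 0; i < n; i++) {
        mask[i] = (shuffle_byte(in[i]) == in[i]) ? 0xFF : 0x00;
    }
}
```

Feeding it the bytes 0x40 0x62 0x20 0x63 gives the shuffled bytes 0x20 0x00 0x20 0x00 and the mask 0x00 0x00 0xFF 0x00: '@' shares its low nibble with the space, but fails the comparison against the original byte.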
That's a neat trick!
btw, here is the despacer I was fiddling with last week https://gist.github.com/aqrit/6e73ca6ff52f72a2b121d584745f89f3#file-despace-cpp-L141
I'm sold, merging.
> That's a neat trick!
Thanks! And thanks for merging! I haven't yet tried to extend the method to any of the other functions. Might be worth it.
The original comparison can be improved by shaving off a few instructions and cycles. The unit tests pass. A similar method could also be used for most of the other functions, provided my method is correct and I didn't miss an edge case.
ps. Thanks for this repo. The code is really helpful!
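One way to gain confidence that no edge case was missed is an exhaustive check over all 256 byte values. The sketch below is scalar and illustrative only: it assumes the baseline tests each byte against space, `\n` and `\r` with three separate comparisons, and the names `direct_is_white`, `lookup_is_white` and `count_mismatches` are made up for this example.

```c
#include <stdint.h>

/* Assumed baseline: three explicit comparisons, one per character. */
static int direct_is_white(uint8_t b) {
    return b == ' ' || b == '\n' || b == '\r';
}

/* Lookup variant: pick a table entry by low nibble (0 if the high bit
   is set, as a PSHUFB-style shuffle would), then compare it with the
   original byte. */
static int lookup_is_white(uint8_t b) {
    static const uint8_t table[16] = {
        ' ', 0, 0, 0, 0, 0, 0, 0, 0, 0, '\n', 0, 0, '\r', 0, 0
    };
    uint8_t shuffled = (b & 0x80) ? 0 : table[b & 0x0F];
    return shuffled == b;
}

/* Returns the number of byte values on which the two variants disagree. */
static int count_mismatches(void) {
    int mismatches = 0;
    for (int v = 0; v < 256; v++) {
        if (direct_is_white((uint8_t)v) != lookup_is_white((uint8_t)v)) {
            mismatches++;
        }
    }
    return mismatches;
}
```

Under these assumptions `count_mismatches()` should return 0. The zero byte is the one case worth a second look: its table entry is 0x20, so it is not mistaken for a match.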