Open vasily-v-ryabov opened 2 months ago
OK, maybe the root cause is at a higher level, not exactly in this function. I haven't figured it out yet.
Interesting, which CPU model are you running on?
Oh, it must be a binding issue. When I avoid passing the extra arguments I get:
>>> print("Hello, world!".find("world")) # original CPython algorithm
7
>>> print(Str("Hello, world!").find("world")) # StringZilla algorithm
7
So it should be coming from python/lib.c, @vasily-v-ryabov.
I run on different Intel CPUs: Core i7-8700 or Core i7-12700H.
I also see a similar issue in the C code.
Wow, that's dangerous! Any chance there is an existing test in scripts/test.cpp that you can extend to highlight that? PRs are always welcome!
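As a starting point for such a test, here is a minimal Python sketch that sweeps every (start, end) window and compares a find-like callable against CPython's str.find. The names check_bounded_find, reference_find, and end_ignoring_find are hypothetical helpers made up for this sketch; none of them comes from StringZilla or scripts/test.cpp, and the deliberately buggy stand-in only illustrates one way a bound can be mishandled, not the actual root cause.

```python
from typing import Callable, List, Tuple

# Hypothetical signature: any callable taking (haystack, needle, start, end).
FindFn = Callable[[str, str, int, int], int]


def check_bounded_find(find: FindFn, haystack: str, needle: str) -> List[Tuple[int, int, int, int]]:
    """Return every (start, end, expected, actual) where `find` disagrees with str.find."""
    mismatches = []
    n = len(haystack)
    for start in range(n + 1):
        for end in range(start, n + 1):
            expected = haystack.find(needle, start, end)  # CPython as the oracle
            actual = find(haystack, needle, start, end)
            if actual != expected:
                mismatches.append((start, end, expected, actual))
    return mismatches


def reference_find(haystack: str, needle: str, start: int, end: int) -> int:
    # Delegates to CPython, so it must produce no mismatches.
    return haystack.find(needle, start, end)


def end_ignoring_find(haystack: str, needle: str, start: int, end: int) -> int:
    # Deliberately buggy stand-in: honours `start` but ignores `end`.
    return haystack.find(needle, start)


if __name__ == "__main__":
    print(check_bounded_find(reference_find, "Hello, world!", "world"))
    print((0, 11, -1, 7) in check_bounded_find(end_ignoring_find, "Hello, world!", "world"))
```

To exercise the bindings, one could pass a small wrapper around Str(...).find in place of reference_find; that wrapper is left out here so the sketch runs without StringZilla installed.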
Describe the bug
StringZilla incorrectly finds a match in this case:
"Hello, world!".find("world", 0, 11)
which must return -1.

Function sz_equal_serial doesn't correctly handle the suffix of the substring "world" (suffix == "d"). It goes straight to the last statement, return (sz_bool)(a_end == a);, where a_end == "!" and a == "!". That is very strange, because the first line of the function is the assignment sz_ptr_t const a_end = a + length;, where length == 1 in the debugger.

It looks like the while loop before the return statement is written incorrectly for this case. After b++, b (which was an empty "\0" string before entering the loop) remains empty by chance, because the subsequent memory is also clean (zeroes).

I'm aware of #72, though it is just a wish for optimization.
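To make the expected semantics concrete, here is a minimal Python model of a length-bounded equality check and a bounded search built on it. It only mirrors the shape described above (an a_end computed up front, a loop, and a final a == a_end comparison); it is not copied from the StringZilla source, and equal_serial and find_bounded are hypothetical names for this sketch. It assumes non-negative start and end.

```python
def equal_serial(a: bytes, a_pos: int, b: bytes, b_pos: int, length: int) -> bool:
    """Compare exactly `length` bytes of a and b, never reading past a_pos + length."""
    a_end = a_pos + length
    # Check the bound first, so no byte beyond the window is ever touched.
    while a_pos != a_end and a[a_pos] == b[b_pos]:
        a_pos += 1
        b_pos += 1
    return a_pos == a_end


def find_bounded(haystack: bytes, needle: bytes, start: int, end: int) -> int:
    """Return the first match lying fully inside haystack[start:end], or -1."""
    end = min(end, len(haystack))
    last = end - len(needle)  # last offset at which the needle still fits
    for i in range(start, last + 1):
        if equal_serial(haystack, i, needle, 0, len(needle)):
            return i
    return -1


if __name__ == "__main__":
    print(find_bounded(b"Hello, world!", b"world", 0, 11))  # window "Hello, worl"
    print(find_bounded(b"Hello, world!", b"world", 0, 12))  # window "Hello, world"
```

The key point is that the candidate offset is limited to end - len(needle), so a match starting at 7 is rejected when end is 11 even though the bytes beyond the window happen to match.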
Steps to reproduce
Simple to test it in Python:
and
Expected behavior
StringZilla version
3.8.4
Operating System
Ubuntu 22.04 and Windows 10/11 64-bit
Hardware architecture
x86
Which interface are you using?
Python bindings
Contact Details
No response
Are you open to being tagged as a contributor?
.git history as a contributor
Is there an existing issue for this?
Code of Conduct