Closed ghost closed 4 years ago
Thanks for the feedback. Those calls should be isprint(pmh_[i])
and isprint(pma_[i])
as you point out. The problem happens when you're debugging the DFA construction, but not otherwise fortunately.
To be more precise, this part of the code constructs hash-based match predictors to speed up find
by searching the input before a DFA match is attempted. The scan
operation used in scanners doesn't use or need this feature.
I have experimented with several ways to speed up searching, by creating a set of tests to evaluate Boyer-Moore, Aho-Corasick, Bitap, and hashing-based algorithms. BM and AH are slow, but BM does beat memchr
in some cases which is evaluated with a threshold function that picks memchr
or BM. If memchr
is picked, it uses the least frequently used character to search to speed this up further.
Ah I see, well thankfully it isn't a major issue and the fix is simple so I'll leave it at that.
Thank you.
This problem will be fixed in the upcoming release as this issue shouldn't happen of course.
Version 1.5.6 is available with the fix.
When using the current release version 1.5.5, my program regenerates debug assertion errors when building under Visual C++ (have not tested other compilers/platforms).
The issue is coming from
isprint()
which is used inside of this block of code below.The
_chvalidator()
function which I assume is called at some point inside ofisprint()
includes the following line:_ASSERTE(c >= -1 && c <= 255);
.I don't think
isprint(i)
was intended, sinceConst::HASH
is 4096 and the loops would go over 255 regardless. I triedisprint(pmh_[i])
andisprint(pma_[i])
which does get rid of errors.