Wikinaut / agrep

AGREP - approximate GREP for fast fuzzy string searching. Files are searched for a string or regular expression, with approximate matching capabilities and user-definable records. Developed 1989-1991 by Udi Manber, Sun Wu et al. at the University of Arizona. ISC open source license since Sept. 2014.

32 characters limit: 64bit? #21

Open jbarth-ubhd opened 1 year ago

jbarth-ubhd commented 1 year ago

Would it be possible to surpass agrep's ~32-character limit by using a 64-bit unsigned long instead of unsigned?

I experimented a bit: I switched to unsigned long, doubled some of the defines in agrep.h, replaced (unsigned)037777777777 with its 64-bit equivalent, and so on. But it didn't work.
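For reference, here is a small sketch of what the 64-bit counterpart of that octal constant looks like. The macro and function names are made up for illustration and are not identifiers from agrep.h. It uses uint64_t rather than unsigned long, since unsigned long is only 32 bits wide on some platforms (64-bit Windows, for example), which is one possible reason a plain type swap might not behave as expected.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration; these are NOT agrep.h identifiers. */

/* 037777777777 is '3' followed by ten '7's: 2 + 30 = 32 one-bits. */
#define ALL_ONES_32 ((uint32_t)037777777777u)

/* 64-bit counterpart: '1' followed by twenty-one '7's: 1 + 63 = 64 one-bits. */
#define ALL_ONES_64 ((uint64_t)01777777777777777777777ull)

/* Returns a mask with the lowest n bits set, for 0 <= n <= 64. */
static uint64_t low_bits_mask(unsigned n)
{
    assert(n <= 64);
    /* Shifting a 64-bit value by 64 is undefined, so handle it separately. */
    return n == 64 ? ALL_ONES_64 : ((UINT64_C(1) << n) - 1);
}
```

Widening the constant alone would not be enough if other places still assume 32-bit words (shift counts, loop bounds, array sizes), so this only shows the constant itself.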

Wikinaut commented 1 year ago

As far as I know (and have known for many years), this is unfortunately not possible!

jbarth-ubhd commented 1 year ago

I just tried it with the fuzzy-searching example code (substitutions only) from https://en.wikipedia.org/wiki/Bitap_algorithm: with a 64-bit unsigned long, it works for queries up to 63 characters long.
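A minimal sketch of that kind of experiment, assuming a substitutions-only Bitap search modelled on the Wikipedia example but with uint64_t state words. The function name bitap_fuzzy_search is hypothetical, and this is not agrep's code. One bit of each word is reserved for the empty prefix, which is why a 64-bit word allows at most 63 pattern characters.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Substitutions-only Bitap search with 64-bit state words (illustrative sketch).
 * Returns a pointer to the first substring of `text` that matches `pattern`
 * with at most k substituted characters. Returns NULL if there is no match,
 * if the pattern is longer than 63 characters, if k is negative, or if
 * allocation fails.
 */
static const char *bitap_fuzzy_search(const char *text, const char *pattern, int k)
{
    size_t m = strlen(pattern);
    uint64_t pattern_mask[256];
    const char *result = NULL;

    if (m == 0) return text;
    if (m > 63 || k < 0) return NULL;   /* bit m must fit into a 64-bit word */

    uint64_t *R = malloc(((size_t)k + 1) * sizeof *R);
    if (R == NULL) return NULL;

    /* A 0 bit at position j means "prefix of length j matches here". */
    for (int d = 0; d <= k; ++d)
        R[d] = ~UINT64_C(1);

    /* pattern_mask[c] has a 0 bit at every position where the pattern holds c. */
    for (size_t c = 0; c < 256; ++c)
        pattern_mask[c] = ~UINT64_C(0);
    for (size_t i = 0; i < m; ++i)
        pattern_mask[(unsigned char)pattern[i]] &= ~(UINT64_C(1) << i);

    for (size_t i = 0; text[i] != '\0'; ++i) {
        uint64_t mask = pattern_mask[(unsigned char)text[i]];
        uint64_t old_prev = R[0];       /* R[d-1] before this step's update */

        R[0] = (R[0] | mask) << 1;

        for (int d = 1; d <= k; ++d) {
            uint64_t tmp = R[d];
            /* Either extend with a matching character, or spend one substitution. */
            R[d] = (old_prev & (R[d] | mask)) << 1;
            old_prev = tmp;
        }

        if ((R[k] & (UINT64_C(1) << m)) == 0) {
            result = text + i + 1 - m;  /* start of the matching window */
            break;
        }
    }

    free(R);
    return result;
}
```

Called as bitap_fuzzy_search(text, pattern, 1), this looks for the pattern with at most one substituted character. It says nothing about agrep's other algorithms, which may depend on the word size in different ways.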

Wikinaut commented 1 year ago

Ah, sounds good. However, please keep in mind that agrep uses (or rather, offers) a whole circus of algorithms, so this might be valid only for that particular case.

Looking forward to your formal patch/pull request; please submit it as a distinct feature branch, so that others can decide which version to use, just in case.

Thanks for your message/issue.

Wikinaut commented 1 year ago

PS. off-topic

I just found your perspective-correction scripts; I am also trying to find a working version of ShiftN.

See https://github.com/Wikinaut/shiftn and https://github.com/Wikinaut/Perspective-Transformation

jbarth-ubhd commented 1 year ago

Quote from https://github.com/heyimalex/bitap: »Pattern size is limited to system word size (mem::size_of::<usize>() - 1), so you can't search for anything longer than 31/63 characters, depending on architecture.«