intel / hyperscan

High-performance regular expression matching library
https://www.hyperscan.io

Can parameters limitPatternLength and limitLiteralCount be increased? #429

Open rongekuta opened 3 months ago

rongekuta commented 3 months ago

I plan to use Hyperscan to match patterns like: 1) .abc{dict1}abc 2) .123{dict2}123

dict1 and dict2 are large sets of strings (stored in files), for example:

dict1: jack rose mike ...

dict2: tom jerry lily ...

I transform my patterns by expanding the dictionary strings into alternations: 1) .abc(jack|rose|mike|...)abc 2) .123(tom|jerry|lily)123
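The expansion described above can be sketched roughly as follows. This is only an illustration: the helper name `expand_dict_pattern` is hypothetical, the three-word list stands in for the real dictionary file, and the words are escaped with Python's `re.escape` on the assumption that its output is also acceptable to Hyperscan's PCRE-style syntax.

```python
import re


def expand_dict_pattern(prefix, words, suffix):
    """Build prefix(word1|word2|...)suffix, escaping each dictionary word.

    Hypothetical helper for illustration; not part of Hyperscan.
    """
    alternation = "|".join(re.escape(w) for w in words)
    return f"{prefix}({alternation}){suffix}"


# Stand-in for the real dict1, which would be read from a file.
dict1 = ["jack", "rose", "mike"]

pattern = expand_dict_pattern(".abc", dict1, "abc")
print(pattern)       # the expanded pattern string
print(len(pattern))  # pattern length in characters
```

The pattern length grows linearly with the total size of the dictionary, so a dictionary of more than 20MB produces a single pattern of at least that size.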

Because dict1 and dict2 are so large (>20MB), the transformed pattern is too large, and Hyperscan raises the error: "Pattern length exceeds limit"

The relevant source code is: https://github.com/intel/hyperscan/blob/master/src/hs.cpp#L538 https://github.com/intel/hyperscan/blob/master/src/grey.cpp#L148

So can the parameters limitPatternLength and limitLiteralCount be increased to match a large pattern built from a dictionary?

rongekuta commented 3 months ago

I changed limitPatternLength and limitLiteralCount to 1MB and compiled a large pattern, and found that Hyperscan compilation is too slow.