Open rmharris opened 1 year ago
@rmharris Thanks for pointing this. hs_expression_lit_info() is absent. Will fix that.
I've also been bitten by this, as before it was possible to search for nul bytes using nul literals (\u0000
) in literal expressions (compile_lit
& compile_lit_multi
), and now that is not possible anymore, and you have to resort to escaped nuls (\\x{00}
) and normal expressions (compile
& compile_multi
) to be able to do that.
So, I'm glad to know this will be fixed.
BTW, @hongyang7, maybe the chance can be taken to clarify compile_lit
and compile_lit_multi
doc (https://intel.github.io/hyperscan/dev-reference/api_files.html#c.hs_compile_lit). As it currently stands, it first says that expression
will be NULL-terminated:
* @param expression
* The NULL-terminated expression to parse.
[...]
But laters says even more explicitly that NULs are allowed within expression:
* @param len
* The length of the text content of the pure literal expression. As the
* text content indicated by @p expression is treated as single character
* one by one, the special terminating character `\0` should be allowed
* to appear in expression, and not treated as a terminator for a string.
* Thus, the end of a pure literal expression cannot be indicated by
* identifying `\0`, but by counting to the expression length.
That seems contradictory, and IMO it would be best to change expression doc to make it clear that NULL-termination is not needed.
Thx everybody involved for this great project.
@elmart-devo Good point. Thanks.
The documentation for hs_compile_lit() states that an expression must be NULL-terminated even though its length must be supplied. Looking though
hs_compile_lit_multi_int()
I see that nothing after the declared length is used by https://github.com/intel/hyperscan/blob/64a995bf445d86b74eb0f375624ffc85682eadfe/src/hs.cpp#L362-L364 However, there follows a call to https://github.com/intel/hyperscan/blob/64a995bf445d86b74eb0f375624ffc85682eadfe/src/hs.cpp#L373-L374 then https://github.com/intel/hyperscan/blob/64a995bf445d86b74eb0f375624ffc85682eadfe/src/parser/logical_combination.cpp#L143-L144 and finally https://github.com/intel/hyperscan/blob/64a995bf445d86b74eb0f375624ffc85682eadfe/src/hs.cpp#L530-L536 Given that the documentation explicitly allows for\0
to appear in the expression it seems like there is some further bug here, perhaps the absence of ahs_expression_lit_info()
.See also #205.