intel / hyperscan

High-performance regular expression matching library
https://www.hyperscan.io

Combo pattern fails when operands set singlematch flag and/or max_offset #430

Open forgittable opened 2 months ago

forgittable commented 2 months ago

The following test cases give unintuitive results when scanned against "buffer": b"0123". In all three, the combination pattern (id 0) is missing from the result:

  1. "patterns": [ "/(1 & 2)/C", "/01/H", "/23/H", ], "result": [1, 2]
  2. "patterns": [ "/(1 & 2)/HC", "/01/H{max_offset=2}", "/23/H", ], "result": [1, 2]
  3. "patterns": [ "/(1 & 2)/C", "/01/H{max_offset=2}", "/23/H", ], "result": [1, 2]

Expected behavior (the combination, id 0, is reported once both operands have matched):

"patterns": [ "/(1 & 2)/HC", "/01/H", "/23/H", ], "result": [1, 2, 0],

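To make the expected result concrete, here is a minimal toy model in C. It does not call Hyperscan, and every name in it (`toy_report`, `COMBO_ID`, `LEFT_ID`, `RIGHT_ID`) is hypothetical. It assumes that ids are positional (the combination is id 0, its operands are ids 1 and 2), that `C` marks the combination pattern and `H` the single-match flag, and that the combination is reported right after the operand match that first makes `1 & 2` true.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names come from Hyperscan. */
enum { COMBO_ID = 0, LEFT_ID = 1, RIGHT_ID = 2 };

/*
 * Replays operand matches in scan order and writes the reported ids to out.
 * The combination "1 & 2" is reported once, right after the operand match
 * that first makes both sides true. Returns the number of ids written.
 * out must have room for n + 1 entries.
 */
static size_t toy_report(const unsigned *operand_matches, size_t n,
                         unsigned *out) {
    bool seen_left = false, seen_right = false, combo_reported = false;
    size_t written = 0;

    assert(operand_matches != NULL && out != NULL);

    for (size_t i = 0; i < n; i++) {
        unsigned id = operand_matches[i];
        out[written++] = id; /* the operand itself is always reported */
        if (id == LEFT_ID) {
            seen_left = true;
        }
        if (id == RIGHT_ID) {
            seen_right = true;
        }
        /* Report the combination the first time both operands are true. */
        if (seen_left && seen_right && !combo_reported) {
            out[written++] = COMBO_ID;
            combo_reported = true;
        }
    }
    return written;
}
```

On b"0123", /01/ ends at offset 2 and /23/ ends at offset 4, so the operand sequence is 1, 2 and the model writes 1, 2, 0. The three failing cases above stop at 1, 2.
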
forgittable commented 2 months ago

Proposed fix:

diff --git a/src/rose/stream.c b/src/rose/stream.c
index 26268dd..33e29db 100644
--- a/src/rose/stream.c
+++ b/src/rose/stream.c
@@ -534,16 +534,20 @@ int can_never_match(const struct RoseEngine *t, char *state,
         return 0;
     }

     if (mmbit_any(getActiveLeafArray(t, state), t->activeArrayCount)) {
         DEBUG_PRINTF("active leaf\n");
         return 0;
     }

+    if (t->ckeyCount) {
+        return 0;
+    }
+
     return 1;
 }

 void roseStreamExec(const struct RoseEngine *t, struct hs_scratch *scratch) {
     DEBUG_PRINTF("OH HAI [%llu, %llu)\n", scratch->core_info.buf_offset,
                  scratch->core_info.buf_offset + (u64a)scratch->core_info.len);
     assert(t);
     assert(scratch->core_info.hbuf);