nitely / nim-regex

Pure Nim regex engine. Guarantees linear time matching
https://nitely.github.io/nim-regex/
MIT License

Implement literals optimization #59

Closed nitely closed 4 years ago

nitely commented 4 years ago

This seems like a really useful optimization. When the regex contains a literal, we can skip over the text until the literal's first character is found (i.e., by calling memchr). Nim's re find and findAll are at least an order of magnitude faster because of this.
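A minimal sketch of the skipping idea, written in Python for illustration only (it is not nim-regex code). It assumes every match must begin with a known byte; `bytes.find` plays the role of memchr, and Python's own `re` stands in for the full regex engine. The name `find_with_prefilter` is hypothetical.

```python
import re

def find_with_prefilter(pattern: bytes, first_byte: bytes, text: bytes):
    """Return (start, end) of the leftmost match, or None.

    Assumption: every match of `pattern` starts with `first_byte`,
    so the matcher only needs to run at positions where it occurs.
    """
    matcher = re.compile(pattern)  # stand-in for the full regex engine
    pos = text.find(first_byte)    # memchr-like skip to the first candidate
    while pos != -1:
        m = matcher.match(text, pos)  # anchored attempt at this candidate
        if m is not None:
            return m.span()
        pos = text.find(first_byte, pos + 1)  # skip to the next candidate
    return None
```

For example, `find_with_prefilter(rb"foo\d+", b"f", b"xxxx fo foo42 bar")` runs the matcher at only two positions (the two `f` bytes) instead of at every offset.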

It should probably live in a separate package that implements fast multi-literal searching (e.g. foo|bar|baz), using memchr when there are one or two literals, and a DFA or the Aho–Corasick algorithm otherwise. It should also support Unicode (memchr should search for the last byte of the Rune's UTF-8 encoding).
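A rough sketch of the two cheap cases, again in Python and only as an illustration: searching a handful of literals with one memchr-like scan each, and searching for a multi-byte rune by its last UTF-8 byte. The last byte is used because the leading byte of a multi-byte sequence is shared by many runes, so it filters poorly. `find_any` and `find_rune` are hypothetical names; a DFA or Aho–Corasick searcher for larger literal sets is not shown.

```python
def find_any(text: bytes, literals, start: int = 0):
    """Return (position, literal) for the leftmost literal found, or None.

    One scan per literal; only reasonable for one or two literals.
    """
    best = None
    for lit in literals:
        pos = text.find(lit, start)
        if pos != -1 and (best is None or pos < best[0]):
            best = (pos, lit)
    return best

def find_rune(text: bytes, rune: str, start: int = 0) -> int:
    """Return the byte offset of `rune` in UTF-8 `text`, or -1."""
    enc = rune.encode("utf-8")
    n = len(enc)
    last = enc[-1:]
    # The last byte of the earliest possible occurrence is at start + n - 1.
    pos = text.find(last, start + n - 1)
    while pos != -1:
        begin = pos - n + 1
        if text[begin:pos + 1] == enc:  # verify the bytes before the hit
            return begin
        pos = text.find(last, pos + 1)
    return -1
```

For a single-byte (ASCII) rune, `find_rune` degenerates to a plain single-byte search, so the same code path covers both cases.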

nim-regex can then find the literal's position and match backward and forward from there.
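One way to picture the backward/forward step, sketched in Python under simplifying assumptions: the regex is split by hand into a prefix, a literal, and a suffix, and the prefix is supplied already reversed so it can be matched against the reversed text before the literal. The function name and this three-part split are hypothetical, not how nim-regex does it; a real engine would run a reversed automaton in place instead of copying and reversing a slice, and would need extra care to guarantee leftmost-match semantics.

```python
import re

def find_around_literal(text: str, rev_prefix: str, literal: str, suffix: str):
    """Return (start, end) of a match of prefix + literal + suffix, or None.

    `rev_prefix` is the prefix pattern written in reverse order.
    """
    rev_prefix_re = re.compile(rev_prefix)
    suffix_re = re.compile(suffix)
    pos = text.find(literal)  # locate the literal first
    while pos != -1:
        # Backward: match the reversed prefix on the reversed left context.
        # (The slice copy is for brevity only; it is not linear time.)
        back = rev_prefix_re.match(text[:pos][::-1])
        # Forward: match the suffix right after the literal.
        fwd = suffix_re.match(text, pos + len(literal))
        if back is not None and fwd is not None:
            return (pos - back.end(), fwd.end())
        pos = text.find(literal, pos + 1)
    return None
```

With the pattern `\d+foo\w*` split as prefix `\d+`, literal `foo`, suffix `\w*`, the call `find_around_literal("ab 123foobar baz", r"\d+", "foo", r"\w*")` locates `foo` first and then extends the match in both directions.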