Literals substring optimization

This reuses the literals optimization to find the literal delimiter, then collects the literals surrounding it and searches the input for the resulting literal substring. It should usually be faster when memmem is supported. Otherwise, performance depends on how many memchr candidates for the delimiter the input text contains compared with how many candidates the literal substring produces, so the result is input dependent.
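To make the candidate trade-off concrete, here is a minimal sketch in Python. It is not the code from this change: the token list, `surrounding_literal`, and `find_all` are hypothetical names used only for illustration, and `str.find` stands in for memchr (one-character needle) and memmem (substring needle).

```python
def surrounding_literal(tokens, delim_idx):
    """Expand from the delimiter over adjacent literal tokens.

    `tokens` is a hypothetical flat list of (kind, value) pairs, where
    kind is "lit" for a literal character and anything else otherwise.
    """
    lo = delim_idx
    while lo > 0 and tokens[lo - 1][0] == "lit":
        lo -= 1
    hi = delim_idx
    while hi + 1 < len(tokens) and tokens[hi + 1][0] == "lit":
        hi += 1
    return "".join(value for _, value in tokens[lo:hi + 1])


def find_all(text, needle):
    """Return every start index of `needle` in `text`.

    Each index is a candidate position where the full regex match
    would have to be attempted.
    """
    positions = []
    i = text.find(needle)
    while i != -1:
        positions.append(i)
        i = text.find(needle, i + 1)
    return positions


# Assumed example: a pattern shaped like \w+@example\.com, with "@"
# picked as the delimiter (token index 1).
tokens = [("op", r"\w+")] + [("lit", c) for c in "@example.com"]
substring = surrounding_literal(tokens, 1)

text = "@a @b @c bob@example.com @d"
delimiter_candidates = find_all(text, "@")        # 5 positions to verify
substring_candidates = find_all(text, substring)  # 1 position to verify
```

In this input the delimiter alone yields five candidates while the full substring yields one, so the substring search skips four failed match attempts. On input where the delimiter is rare, the two candidate lists are nearly the same size and the gain mostly disappears.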

I also added support for Unicode literal delimiters.

I added some benchmarks in which it is ~6x faster than before and 3x faster than PCRE.