telekons / one-more-re-nightmare

A fast regular expression compiler in Common Lisp
https://applied-langua.ge/projects/one-more-re-nightmare/
BSD 2-Clause "Simplified" License
138 stars 9 forks source link

Runtime feedback for SIMD and repetition #10

Open no-defun-allowed opened 2 years ago

no-defun-allowed commented 2 years ago

Consider the regular expression "(¬")*", which will match some text enclosed in double quotes.

There are two ways to run the "tight loop" currently:

The latter is faster than the former for long texts, but not for shorter texts. (I suspect less than one SIMD vector wide? idk) So to achieve best performance, we should choose the more appropriate tight loop implementation. There is no indication of what lengths are expected in the regular expression, and we have no way to annotate such things, so we must rely on runtime feedback (which is probably for the better, in terms of performance/effort on behalf of the user).

According to Cliff Click (from a conversation in the coffee compiler club) it should suffice to maintain a count of how many times the tight loop is iterated. Upon entering the loop, we compare the last count to some crossover point. If we have a higher count, we take the SIMD route, else we use scalar code.

There is also a case for unrolling the scalar code according to Gilbert Baumann, but I haven't tried that yet. For very long loop counts, unrolling the SIMD code may also produce some benefit, but we may face more code bloat; I still intend to compile everything eagerly, because it is simpler, and the duplicate code is not too large anyway.

no-defun-allowed commented 2 years ago

More complex tests, such as that for ab|ac, will end up using all the vector pipes of a processor and unrolling won't help there. We need pretty simple tests to benefit from unrolling SIMD code, e.g. just a single code point to test for, rather than a range.