[x] I searched issues and couldn’t find anything (or linked relevant results below)
[x] If applicable, I’ve added docs and tests
Description of changes
This is a quick fix for the problem described in #8, where a long line causes the `findEmail` regular expression to exhibit pathological behavior.
The solution presented here is to replace every `+` in the regular expression with `{1,255}`. Matching is still super-linear on long lines, but the bound is now small enough that the function completes in an acceptable period of time.
The maximum length of a valid email address is 255 according to the IETF, but I've left room in the regex for 64 characters before and 255 after the `@`.
This only improves matters; it doesn't fix the problem entirely.
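As a rough illustration of the change, here is a minimal Python sketch of the same idea. The two patterns are hypothetical stand-ins, not the project's actual `findEmail` regex, and `find_emails` is a name invented for this example.

```python
import re

# Hypothetical stand-ins for illustration only; the real findEmail
# pattern in this project may look different.
UNBOUNDED = r"[\w.%-]+@[\w.-]+\.[A-Za-z]+"
# Same shape, with every `+` swapped for a bounded repetition:
# 64 characters before the `@`, 255 after it.
BOUNDED = r"[\w.%-]{1,64}@[\w.-]{1,255}\.[A-Za-z]{1,255}"


def find_emails(line, pattern=BOUNDED):
    """Return every substring of `line` that `pattern` matches."""
    return re.findall(pattern, line)


# A long run of word characters with no `@` is the kind of input that
# makes the unbounded pattern rescan the same text at every start offset.
line = "x" * 2000 + " contact: user@example.com"
print(find_emails(line))
```

With the bounded pattern, each start position can consume at most 64 characters before giving up on finding an `@`, so the work per position is capped rather than growing with the line length. One side effect worth noting: a local part longer than 64 characters is no longer matched whole, only its trailing portion.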
Before this PR, line lengths of up to about 50,000 cause a recursion failure:
With this PR, I can run the regex on strings of length up to 10 megabytes, which seems like a very comfortable line length, certainly a big improvement:
but it still hits the recursion limit above that.
closes #8