Open braedon opened 3 years ago
we've had a lot more trouble with this particular regex than I expected when I implemented it, tbh. It mostly works, but, I think it causes more problems than it solves. You could submit a PR with that change, I definitely want to switch approaches to something faster and less finnicky in the long run though
Armorable.__armor_regex.search()
can run indefinitely on certain text. This appears to be a catastrophic backtracking issue.For example,
pgpy.PGPMessage.from_blob()
will run for hours without completion on this signed security.txt file. I've created a gist that replicates the issue.The file in question doesn't match the regex because it includes a space on each blank line, which the regex doesn't allow for the blank lines after the headers. The file uses
\r\n
line endings, which seems to be required for the issue - presumably due to interactions between the optional\r
s and the various other optional clauses. Replace the line endings with just\n
and the search fails quickly.The issue can be mitigated by changing how the
cleartext
group detects the end of the cleartext block.It currently matches arbitrary lines, and then uses a positive lookahead assertion to match a final line that's followed by a line starting with
-----
. I'm not a regex expert, but I think this is problematic as thecleartext
group can match the whole rest of the text before the lookahead assertion (and other later clauses) makes it backtrack.A negative lookahead assertion can instead be used to stop the cleartext group when it first encounters a line starting with
-----
.Replace:
With:
i.e. match as many lines as you can that aren't followed by
-----
, and then match the final line (implicitly followed by-----
).This change reduces the search failure return time for this file to ~1s on my laptop - not great, but usable.
Note that I don't think the regex should fail to match this file at all - my reading of rfc4880 is that whitespace should be allowed in the blank lines after the headers - but I'll raise another issue for that.