mailgun / talon

Apache License 2.0

Quotations parser gets stuck #168

Closed Nipsuli closed 5 years ago

Nipsuli commented 6 years ago

I had an email for which mark_message_lines produces the following markers:

'tetetesssssmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmeetttttttttteteetttttttttteeetttttteeettttteeetttttttttteeettttt'

that gets completely stuck in re.finditer('(?<=m)e*((?:t+e*)+)m', markers) on line 289 in quotations.py

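Basically, the regex causes catastrophic backtracking: the nested quantifiers in (?:t+e*)+ let a run of t markers be split between iterations in exponentially many ways, and the engine tries all of them when no closing m follows.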
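
The blow-up can be reproduced on much shorter input than the marker string above. The following is a minimal sketch, not code from talon: `time_search` and `worst_case` are hypothetical helpers, and the second pattern is the rewrite proposed later in this thread. It assumes a worst case of one `m` followed by n `t` markers and no closing `m`, so every search fails.

```python
import re
import time

# Pattern quoted in this issue, and the rewrite proposed later in the thread.
ORIGINAL = re.compile(r'(?<=m)e*((?:t+e*)+)m')
PROPOSED = re.compile(r'(?<=m)e*(t[te]*)m')


def time_search(pattern, markers):
    """Return the seconds spent on a single pattern.search() call."""
    start = time.perf_counter()
    pattern.search(markers)
    return time.perf_counter() - start


def worst_case(n):
    """Hypothetical worst case: one 'm', then n 't' markers, no closing 'm'."""
    return 'm' + 't' * n


if __name__ == '__main__':
    # Keep n small: the time for ORIGINAL roughly doubles with each extra 't'.
    for n in (14, 16, 18, 20):
        markers = worst_case(n)
        print(n, time_search(ORIGINAL, markers), time_search(PROPOSED, markers))
```

The timings depend on the machine, so none are listed here. The point is the trend: the first column of timings grows with every extra `t`, while the second stays flat.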

Environment: talon 1.4.4, Python 3.6.5 [GCC 6.3.0 20170516] on linux

ad-m commented 6 years ago

I think it should be rewritten as (?<=m)e*(t[te]+)m. What do you think, @Nipsuli ?

>>> re.search('(?<=m)e*((?:t+e*)+)m', 'metetetetetetetem').groups()
('tetetetetetete',)
>>> re.search('(?<=m)e*(t[te]+)m', 'metetetetetetetem').groups()
('tetetetetetete',)

This issue has hit me too, a few times in the last month.

Nipsuli commented 6 years ago

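That seems to be the right direction. I think the quantifier should be * instead of +, i.e. (?<=m)e*(t[te]*)m, because t[te]+ needs at least two markers in the group and so no longer matches a single t between two m markers: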

>>> re.search('(?<=m)e*((?:t+e*)+)m', 'metetetetetetetem').groups()
('tetetetetetete',)
>>> re.search('(?<=m)e*(t[te]+)m', 'metetetetetetetem').groups()
('tetetetetetete',)
>>> re.search('(?<=m)e*(t[te]*)m', 'metetetetetetetem').groups()
('tetetetetetete',)
>>>
>>> re.search('(?<=m)e*((?:t+e*)+)m', 'mtm').groups()
('t',)
>>> re.search('(?<=m)e*(t[te]+)m', 'mtm').groups()  # AttributeError: 'NoneType' object has no attribute 'groups'
>>> re.search('(?<=m)e*(t[te]*)m', 'mtm').groups()
('t',)
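
The comparison above covers only two inputs. One way to gain more confidence is to compare the patterns exhaustively on all short marker strings. This is a sketch under stated assumptions, not a test from talon: `matches` and `first_difference` are hypothetical helpers, and the alphabet is limited to `t`, `e`, `m` plus `s` as a stand-in for any other marker.

```python
import itertools
import re

ORIGINAL = re.compile(r'(?<=m)e*((?:t+e*)+)m')
PLUS = re.compile(r'(?<=m)e*(t[te]+)m')   # first proposal
STAR = re.compile(r'(?<=m)e*(t[te]*)m')   # proposal with * instead of +


def matches(pattern, markers):
    """Collect (span, captured group) for every match, as re.finditer sees them."""
    return [(m.span(), m.group(1)) for m in pattern.finditer(markers)]


def first_difference(candidate, max_len=7, alphabet='tems'):
    """Return the first short marker string on which candidate and ORIGINAL
    disagree, or None if they agree on every string up to max_len."""
    for length in range(max_len + 1):
        for chars in itertools.product(alphabet, repeat=length):
            markers = ''.join(chars)
            if matches(ORIGINAL, markers) != matches(candidate, markers):
                return markers
    return None


if __name__ == '__main__':
    print(first_difference(PLUS))  # 'mtm'
    print(first_difference(STAR))  # None
```

Strings are kept short on purpose so that the original pattern's backtracking stays cheap. Agreement on short strings is evidence, not proof, that the two patterns are interchangeable.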

ad-m commented 6 years ago

@Nipsuli, I opened pull request #172 to move this issue forward.

obukhov-sergey commented 5 years ago

@ad-m fixed with https://github.com/mailgun/talon/pull/172, thx for the PR!

teeberg commented 5 years ago

Is this worth requesting a CVE since it can be used to cause a denial of service? https://www.google.com/search?q=CVE+catastrophic+backtracking