splunk / splunk-sdk-python

Splunk Software Development Kit for Python
http://dev.splunk.com
Apache License 2.0
687 stars 369 forks source link

Regular expression in CommandLineParser exhibits catastrophic backtracking #309

Closed mschwager closed 3 years ago

mschwager commented 4 years ago

Hi there! I've been working on a new Python static analysis tool called Dlint. Most recently I've been working on a rule that searches for regular expression denial-of-service: DUO138. When running this rule against your codebase I found the following violations:

$ python -m flake8 --select=DUO138 splunklib/
splunklib/searchcommands/internals.py:217:21: DUO138 catastrophic "re" usage - denial-of-service possible
splunklib/searchcommands/internals.py:235:22: DUO138 catastrophic "re" usage - denial-of-service possible
splunklib/searchcommands/internals.py:237:19: DUO138 catastrophic "re" usage - denial-of-service possible
splunklib/searchcommands/search_command.py:834:22: DUO138 catastrophic "re" usage - denial-of-service possible

After further investigation, it appears the violation in internals.py at line 235 is a true positive and the rest are false positives. If we dig in further:

class CommandLineParser(object):
    ...
    _fieldnames_re = re.compile(r"""("(?:\\.|""|[^"])+"|(?:\\.|[^\s"])+)""")

The violation occurs due to (?:\\.|""|[^"])+. This is due to mutually inclusive alternation within a quantifier. Since \\. and [^"] have character overlap. We can confirm the bug with a specially crafted string and the following code:

from splunklib.searchcommands import internals
internals.CommandLineParser._fieldnames_re.search('"' + r'\\.' * 64 + "'")
...Spins...

To fix the issue we should avoid the character overlap between \\. and [^"]. One way to accomplish this is with the following expression:

_fieldnames_re = re.compile(r"""("(?:\\.|""|[^"\\])+"|(?:\\.|[^\s"])+)""")

Note that the other false positives found contain similar expressions, and despite not being vulnerable to this issue, it may be worth updating them as well. This will avoid them becoming vulnerable in the future.

If you'd like to learn more about ReDoS and catastrophic backtracking I recommend checking out this blog post. I hope this helps, let me know if you have any questions!

shakeelmohamed commented 4 years ago

Thanks for the bug report @mschwager, this is very helpful information. We are unable to implement these changes at the moment but we are always open to a pull request

pedworthy-r7 commented 3 years ago

Ping, incase it wasn't noticed @mschwager has posted a fix for this.

This vulnerability was brought to my attention by a client of ours who uses Sonatype which is reporting this library as a security risk (sonatype-2020-0090).

bclarkejr commented 3 years ago

@shakeelmohamed - It looks like @mschwager posted a fix for this shortly after your comment (going on a year now). Any chance we could get this merged? It is a vulnerability as of late 2020, like @pedworthy-r7 mentioned.

Happy to help however you need. Thanks!

fantavlik commented 3 years ago

Fix is now available in version 1.6.16 https://github.com/splunk/splunk-sdk-python/releases/tag/1.6.16 install with pip install splunk-sdk==1.6.16