PyCQA / bandit

Bandit is a tool designed to find common security issues in Python code.
https://bandit.readthedocs.io
Apache License 2.0
6.51k stars 612 forks source link

Flag `markupsafe.Markup` on non-literal content #1067

Open xmo-odoo opened 1 year ago

xmo-odoo commented 1 year ago

Is your feature request related to a problem? Please describe. markupsafe.Markup is semantically similar to Django's mark_safe (or SafeString) (B308), it marks the wrapped string as passing unmodified through markupsafe.escape.

Notably, markupsafe is what Jinja relies on for its (auto)-escaping, it is explicitly documented in that context.

Markup-ing literal content is normally safe, but non-literal content either is suspicious, or should be formatted in via Markup.format or Markup.__mod__ (which automatically escapes non-Markup content).

Describe the solution you'd like Bandit flagging such uses, maybe as part of B308, maybe as a new and separate diagnosis.

Also maybe django.utils.safestring.SafeString and django.utils.safestring.SafeText should be added to B308? mark_safe is just a pretty thin wrapper around SafeString which handles the __html__ protocol and can be used as a decorator.

Daverball commented 1 year ago

There's sort of already an existing pull request for this #877 (it uses the flask alias flask.Markup though) and was soft-rejected on the basis of being maybe too narrow for a third party library rule (although their reasoning does not make sense to me in the context of existing rules)

I would also be very interested in adding a rule or extend the existing B308 rule for this. The proposed implementation in the existing pull request would raise a ton of false positives though and I think we can do better for B308 as well. Since passing string literals should always be safe. It's only once you're passing in dynamic content, that you may be exposing yourself to a XSS vulnerability.

I would be happy to put some time into creating a pull request of my own, if the maintainers are open to a change here.

xmo-odoo commented 1 year ago

There's sort of already an existing pull request for this https://github.com/PyCQA/bandit/pull/877 (it uses the flask alias flask.Markup though)

Ah yeah I searched for mentions of markupsafe so I missed it (also I think I didn't look for PRs anyway so I'd have missed it regardless).

and was soft-rejected on the basis of being maybe too narrow for a third party library rule (although their reasoning does not make sense to me in the context of existing rules)

I would agree, if django as support in core it doesn't make much sense for this not to be even if it were just for flask, but even more so as it's used by multiple projects inside and outside the Pocoo org. including jinja which already has core support. The reasoning could make sense if the Django and Jinja support were moved out of core and bandit-core only concerned itself with the stdlib I guess.

Since passing string literals should always be safe. It's only once you're passing in dynamic content, that you may be exposing yourself to a XSS vulnerability.

Absolutely, the rule should not trigger for literal content (though it should for f-strings, I've no idea whether bandit can easily make the distinction as I know nothing of the implementation details). If mark_safe does, then it would be a good idea to fix that as well.

Daverball commented 1 year ago

Absolutely, the rule should not trigger for literal content (though it should for f-strings, I've no idea whether bandit can easily make the distinction as I know nothing of the implementation details). If mark_safe does, then it would be a good idea to fix that as well.

It's quite easy to detect in the python AST if an argument is a pure literal string or something else (including f-strings). There is an existing flake8 plugin which implements this check, but it has some additional questionable workarounds for gettext _ and a literal function from some other third-party library, it's also currently not published on pypi, so it needs to be installed from git.

Daverball commented 1 year ago

@ericwb @lukehinds @sigmavirus24 I would love for one of the maintainers to chime in on this.

xmo-odoo commented 1 year ago

It's quite easy to detect in the python AST if an argument is a pure literal string or something else (including f-strings).

Oh yeah if bandit just uses ast then it's not difficult, especially if you don't bother trying to const-evaluate things (or at least not more than trivially).