pixee / codemodder-python

Python implementation of the Codemodder framework
GNU Affero General Public License v3.0
38 stars 10 forks source link

Sonar: fix "XML parsers should not be vulnerable to XXE attacks" with `use-defusedxml` codemod #295

Open drdavella opened 7 months ago

drdavella commented 7 months ago

We should be able to use the existing use-defusedxml codemod to fix this issue. This will require separating the transformer implementation from the existing codemod so that it can be used by the new Sonar codemod.

Rule description: https://rules.sonarsource.com/python/RSPEC-2755/

clavedeluna commented 7 months ago

I see a couple issues that need to be addressed at this time.

  1. the sonar rule is more generic than our codemod. The rule can be mapped to both use-defusedxml (catches stuff for xml lib) and 2 codemods lxml-safe-parsing, safe-lxml-parser-defaults (catch stuff for lxml. How do we represent one sonar rule mapping to 3 codemodder transformers in codemodder?

  2. I noticed that what sonar would detect as not-compliant is not the same code codemodder would detect. From sonar docs :

# noncompliant exampe from sonar
import xml.sax
parser = xml.sax.make_parser()
myHandler = MyHandler()
parser.setContentHandler(myHandler)
parser.setFeature(feature_external_ges, True) # Noncompliant
parser.parse('xxe.xml')

You read this and I assume they're going to flag the line that says # Noncompliant. I tested this out on our sonarcloud instance with just this code to see if it detects it:

import xml.sax
parser = xml.sax.make_parser()
parser.parse('xxe.xml')

This is the same code we test in test_use_defusedxml.py. As you can see, we flag on making the parser, not on setting a feature. I ran this on sonarcloud and it did not flag this last snippet, while we would.

So, how exact should the codemodder codemods to sonar relationship be? should we only create sonar codemods for stuff that our codemod would exactly match?

drdavella commented 7 months ago

How do we represent one sonar rule mapping to 3 codemodder transformers in codemodder?

This is a good question. We should be able to do this with the new transformer pipeline API, although this would be the first codemod that uses more than a single transformer. I think it's a good test case although you are probably going to encounter a few bumps along the road. This would mean defining a LibcstTransformerPipeline with three transformers, one corresponding to each existing codemod.

Here's what I would suggest: for this particular Sonar finding, we want to apply those three transformers to the entire file. This means we don't need to filter by line number, etc. but only by file. The tricky part is going to be making sure that we apply this codemod to this particular file only once. Sonar could potentially report multiple results per file for this rule, but we know that running this codemod once will fix all of the problems.

Let me know whether this makes sense. We'll probably need to iterate a bit and if you'd like to look at a simpler case first that's okay.