Closed litios closed 2 weeks ago
Would love to see that.
@litios this sounds indeed like a valuable contribution for the library and the users. I would be more than happy to review and approve such PR.
I would prefer some clarification here, as I want to use xmltodict for a project where security against malicious input is relevant. I found this bug, as well as another bugs indicating unspecified security concerns #279.
However the report above is unclear what exact security issues it is talking about.
To my knowledge there are generally two classes of XML-specific vulnerabilities to be concerned about: DoS issues and include issues.
As for DoS issues, there are the entity expansion issues that create a lot of data, which includes issues like billion laughs or quadratic blowup, as listed in Python's list. However this is mitigated in expat since a while. The somewhat different hashdos issue is also mitigated in expat via the use of siphash and a random salt.
Then there are the include issues. Via DTD, entity expansion or the xinclude feature one may access local or external files, which could be used for a variety of attacks like data exfiltration or SSRF. However all three features are not enabled by default in the etree API, and in a test of mine none of these issues was exploitable in xmltodict.
According to Python's doc external entity expansion was disabled in various APIs in Python 3.7.1. Python 3.6 is already out of support and python 3.7 will be out of support soon. It is unclear to me where the claim that "not so older python versions like 3.8" are affected by any of this comes from.
In summary it is not clear to me which security issues exist, except maybe with extremely outdated python or expat versions with known problems.
It would probably be good to remove the use of the obsolete defusedexpat import, but it is not clear to me whether defusedxml is still needed.
The main idea was to point out that, right now, vulnerable Python versions won't have a safe alternative for untrusted input to known attacks. If running modern patched Python versions, you should be safe with expat.
Python 3.8, for example, is no longer affected since v3.8.12 but older 3.8 versions will be affected. Same for Python 3.7. And Python3.8 won't be out of support until the end of 2024.
And I agree that you will still need to enable allowing entities for attacks like billion laughs to work. I simply wanted to point out that defusedxml could be a good alternative to the old defusexpat meanwhile there are vulnerable Python expat versions out there.
@martinblech I don't have the bandwidth right now to provide a PR for this. But I still wanted to raise the issue, just in case someone else wants to implement it in the meantime.
Python 3.8, for example, is no longer affected since v3.8.12 but older 3.8 versions will be affected. Same for Python 3.7. And Python3.8 won't be out of support until the end of 2024.
@litios I think there may be a misunderstanding here, let's find out and be sure. To my understanding Python 3.8 being supported means that that the CPython project continues to do new 3.8.x releases with security fixes and that users can expected to be safe and receive updates if they run the latest 3.8.x version at any point in time, e.g. version 3.8.16 as of today. If someone is running say 3.8.8 today, they are using a supported track of the language but cannot expect to be secure. As a result, users running vanilla 3.8.x <=3.8.15 (meaning without backports) need to update their CPython and do not need any particular concern here. Does that sound correct?
@hartwork yes, I totally agree. I'm sorry if what I wrote was unclear. Thanks for clarifying it!
Just to summarize this issue, as I think this can confuse others who might find this report: There is no security problem with currently supported versions of expat and python.
The security issues that were mitigated by defusedexpat are issues in older, unsupported versions of expat and python. I have therefore created a PR to remove the obsolete use of defusedexpat.
I think this can be closed now.
Hello,
Ubuntu is working to include xmltodict in our next release. As part of the "main inclusion review" (MIR) process to add software to our main package repository, we perform security audits. You can find the full xmltodict MIR bug report here
While auditing it, we found that the library is vulnerable to typical XML attacks due to relying on expat as defusedexpat doesn't have support since 2016 and only up to Python 3.4.
In newer versions of expat these issues seem to be mitigated:
but it still affects not so older python versions like 3.8
As recommended by the documentation, there is currently a supported alternative called defusedxml owned by the same maintainer that owned defusedexpat.
It would be very beneficial to replace the old
defusedexpat
withdefusedxml
. Does that sound like something that could be valuable to the library and the users?Thanks in advance!