infobyte / faraday

Open Source Vulnerability Management Platform
https://www.faradaysec.com
GNU General Public License v3.0

XML plugins unable to parse large files #400

Closed ignis-sec closed 4 years ago

ignis-sec commented 4 years ago

Issue Type

Faraday version

3.11.0

Component Name

Xml Plugins

Steps to reproduce

Upload any large XML file (mostly nmap reports, preferably larger than 2 MB).

Pip freeze

acme==1.4.0
alabaster==0.7.12
alembic==1.4.2
apispec==3.3.0
apispec-webframeworks==0.5.2
appdirs==1.4.3
aprslib==0.6.47
asn1crypto==0.24.0
attrs==19.3.0
autobahn==20.4.3
Automat==20.2.0
Babel==2.8.0
bcrypt==3.1.7
beautifulsoup4==4.7.1
blinker==1.4
bottle==0.12.15
cairocffi==1.1.0
cbor==1.0.0
certbot==1.4.0
certbot-nginx==1.4.0
certifi==2018.8.24
cffi==1.14.0
Chameleon==3.6.2
chardet==3.0.4
cli-helpers==1.2.1
click==7.1.2
cloud-init==18.3
colorama==0.4.3
ConfigArgParse==0.13.0
configobj==5.0.6
configparser==5.0.0
constantly==15.1.0
cryptography==2.8
Cython==0.29.14
deprecation==2.1.0
distlib==0.3.0
distro==1.4.0
Django==2.2.12
dnspython==1.16.0
docutils==0.16
ecdsa==0.15
enum34==1.1.10
factory-boy==2.12.0
Faker==4.1.0
faraday-plugins==1.1
-e git+https://github.com/infobyte/faraday@1bde0faae4a20c36b0a568e95e00fb517c646f81#egg=faradaysec
feedparser==5.2.1
filedepot==0.7.1
filelock==3.0.12
filteralchemy==0.1.0
filteralchemy-fork==0.1.0
flake8==3.8.1
Flask==1.1.2
Flask-BabelEx==0.9.4
Flask-Classful==0.14.2
Flask-Cors==3.0.8
Flask-KVSession==0.6.2
Flask-KVSession-fork==0.6.3
Flask-Login==0.4.1
Flask-Mail==0.9.1
Flask-Principal==0.4.0
Flask-Restless==0.17.0
Flask-Security==3.0.0
Flask-Session==0.3.1
Flask-SQLAlchemy==2.4.1
Flask-WTF==0.14.3
future==0.18.2
html2text==2019.8.11
html5lib==1.0.1
humanize==2.4.0
hupper==1.10.2
hyperlink==19.0.0
hypothesis==4.18.3
idna==2.6
imagesize==1.2.0
importlib-metadata==1.5.0
incremental==17.5.0
inflection==0.4.0
IPy==1.0
itsdangerous==1.1.0
Jinja2==3.0.0a1
josepy==1.2.0
jsonpatch==1.21
jsonpointer==1.10
jsonschema==2.6.0
lxml==4.5.1
lz4==3.0.2+dfsg
Mako==1.1.2
MarkupSafe==1.1.1
marshmallow==2.21.0
marshmallow-sqlalchemy==0.15.0
mccabe==0.6.1
mimerender==0.6.0
mockito==1.2.1
more-itertools==4.2.0
netaddr==0.7.19
nplusone==1.0.0
oauthlib==2.1.0
olefile==0.46
packaging==20.3
parsedatetime==2.4
passlib==1.7.2
Paste==3.4.0
PasteDeploy==2.1.0
PasteScript==2.0.2
pgcli==2.1.1
pgspecial==1.11.10
Pillow==7.1.2
plaster==1.0
plaster-pastedeploy==0.5
pluggy==0.13.1
prompt-toolkit==2.0.10
psycopg2==2.8.5
psycopg2-binary==2.8.4
py==1.8.1
py-ubjson==0.14.0
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycairo==1.16.2
pycodestyle==2.6.0
pycparser==2.20
pycryptodomex==3.6.1
pycurl==7.43.0.2
pydot==1.4.1
pyflakes==2.2.0
Pygments==2.6.1
PyHamcrest==2.0.2
PyICU==2.2
pyinotify==0.9.6
PyJWT==1.7.0
PyNaCl==1.3.0
pyOpenSSL==19.1.0
pyparsing==2.4.6
pypcapfile==0.12.0
pypng==0.0.20
PyQRCode==1.2.1
pyramid==1.10.4
pyRFC3339==1.1
pyserial==3.4
pytest==5.4.2
pytest-factoryboy==2.0.3
pytest-flake8==1.0.6
python-dateutil==2.8.1
python-editor==1.0.3
python-magic==0.4.16
python-mimeparse==1.6.0
python-snappy==0.5.3
PyTrie==0.2
pytz==2019.3
PyYAML==5.3.1
repoze.lru==0.7
requests==2.23.0
requests-toolbelt==0.8.0
responses==0.10.14
selenium==4.0.0a1
service-identity==18.1.0
setproctitle==1.1.10
simplejson==3.17.0
simplekv==0.13.0
six==1.12.0
snowballstemmer==2.0.0
soupsieve==2.0
speaklater==1.3
Sphinx==3.0.3
sphinxcontrib-applehelp==1.0.2
sphinxcontrib-devhelp==1.0.2
sphinxcontrib-htmlhelp==1.0.3
sphinxcontrib-jsmath==1.0.1
sphinxcontrib-qthelp==1.0.3
sphinxcontrib-serializinghtml==1.1.4
SQLAlchemy==1.3.17
sqlalchemy-schemadisplay==1.3
sqlparse==0.3.1
syslog-rfc5424-formatter==1.1.1
tabulate==0.8.7
Tempita==0.5.2
terminaltables==3.1.0
text-unidecode==1.3
tornado==5.1.1
tqdm==4.46.0
translationstring==1.3
Twisted==20.3.0
txaio==20.4.1
u-msgpack-python==2.3.0
ufw==0.36
Unidecode==1.1.1
urllib3==1.24.1
venusian==3.0.0
virtualenv==20.0.16
waitress==1.4.1
wcwidth==0.1.9
webargs==5.5.3
webencodings==0.5.1
WebOb==1.8.6
websocket-client==0.53.0
WebTest==2.0.34
Werkzeug==0.16.1
wfuzz==2.4.5
wsaccel==0.6.2
WTForms==2.1
XlsxWriter==1.2.8
zipp==1.0.0
zope.component==4.3.0
zope.deprecation==4.4.0
zope.event==4.4
zope.hookable==5.0.1
zope.interface==4.7.1


OS

PRETTY_NAME="Debian GNU/Linux 10 (buster)"
NAME="Debian GNU/Linux"
VERSION_ID="10"
VERSION="10 (buster)"
VERSION_CODENAME=buster
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"
ignis-sec commented 4 years ago

I've tracked this issue to its source; it seems to be lxml related.

ET.fromstring(xml_output) returns None shortly after.

Turns out libxml added this line 12 years ago, after CVE-2008-4226, and it limits the maximum size that can be parsed at once to 10 MB.

There seems to be no good solution for parsing from a string. However, ET.iterparse is supposed to be used for parsing large XML files.
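
For reference, a minimal sketch of the streaming approach mentioned above. It assumes the standard library's xml.etree.ElementTree (lxml exposes an iterparse with a similar interface); the count_hosts helper and the sample report are hypothetical and not taken from the Faraday plugins.

```python
import io
import xml.etree.ElementTree as ET


def count_hosts(source):
    """Stream an nmap-style XML report and count its <host> elements.

    `source` is a filename or a binary file object. This helper is
    hypothetical and only illustrates the iterparse pattern.
    """
    count = 0
    # "end" events fire once an element has been read completely.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "host":
            count += 1
            # Drop the subtree that was just processed so memory use
            # does not grow with the size of the report.
            elem.clear()
    return count


# Made-up sample data standing in for a real report.
sample = (
    b"<nmaprun>"
    b"<host><address addr='10.0.0.1'/></host>"
    b"<host><address addr='10.0.0.2'/></host>"
    b"</nmaprun>"
)
print(count_hosts(io.BytesIO(sample)))
```

Because elements are handled one at a time, the whole document never has to be held as a single string.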

ignis-sec commented 4 years ago

Whoops, closing the issue. Turns out the file was being silently rejected because of a structural problem: XML reports of incomplete scans are missing the closing </nmaprun> tag at the end.
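
A small sketch of how a truncated report like this can be surfaced instead of failing silently. It assumes the standard library's xml.etree.ElementTree, where malformed input raises ParseError; the parse_report helper is hypothetical and is not how the Faraday plugins handle it.

```python
import xml.etree.ElementTree as ET


def parse_report(xml_output):
    """Return (root, error); root is None when the XML is not well-formed.

    Hypothetical helper: it reports the parser's message rather than
    swallowing the failure.
    """
    try:
        return ET.fromstring(xml_output), None
    except ET.ParseError as exc:
        return None, str(exc)


complete = "<nmaprun><host/></nmaprun>"
# Closing </nmaprun> is missing, as in the report of an interrupted scan.
truncated = "<nmaprun><host/>"

for name, data in (("complete", complete), ("truncated", truncated)):
    root, error = parse_report(data)
    if root is None:
        print(f"{name}: rejected ({error})")
    else:
        print(f"{name}: parsed root <{root.tag}>")
```

Logging the parser's message makes it clear that the report is structurally incomplete rather than too large.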