rapid7 / nexpose-client-python

DEPRECATED: Rapid7 Nexpose API client library written in Python
https://www.rapid7.com/
BSD 3-Clause "New" or "Revised" License

Unable to execute the request: xmlSAX2Characters: huge text node #47

Open fruechel opened 6 years ago

Generating a CSV AdHoc report with a large result leads to the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/nexpose/nexpose.py", line 306, in _Execute_APIv1d1
    return Execute_APIv1d1(self._URI_APIv1d1, request, self.timeout)
  File "/usr/local/lib/python3.6/site-packages/nexpose/nexpose.py", line 66, in Execute_APIv1d1
    return as_xml(response)
  File "/usr/local/lib/python3.6/site-packages/nexpose/xml_utils.py", line 65, in as_xml
    return from_large_string(s).getchildren()[0]
  File "/usr/local/lib/python3.6/site-packages/nexpose/xml_utils.py", line 51, in from_large_string
    return etree.XML(s.encode('utf-8'))
  File "src/lxml/etree.pyx", line 3209, in lxml.etree.XML (src/lxml/etree.c:80823)
  File "src/lxml/parser.pxi", line 1871, in lxml.etree._parseMemoryDocument (src/lxml/etree.c:121250)
  File "src/lxml/parser.pxi", line 1759, in lxml.etree._parseDoc (src/lxml/etree.c:119926)
  File "src/lxml/parser.pxi", line 1125, in lxml.etree._BaseParser._parseDoc (src/lxml/etree.c:114173)
  File "src/lxml/parser.pxi", line 598, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/etree.c:107738)
  File "src/lxml/parser.pxi", line 709, in lxml.etree._handleParseResult (src/lxml/etree.c:109447)
  File "src/lxml/parser.pxi", line 638, in lxml.etree._raiseParseError (src/lxml/etree.c:108301)
  File "<string>", line 129881
lxml.etree.XMLSyntaxError: xmlSAX2Characters: huge text node, line 129881, column 54

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/vulnbutler/app.py", line 169, in sync_with_nexpose_api
    scan_created_issues = process_scan_id(scan_id, site_id)
  File "/usr/local/lib/python3.6/site-packages/vulnbutler/app.py", line 75, in process_scan_id
    results = nexpose_connection.get_nexpose_connection().export_scan(scan_id)
  File "/usr/local/lib/python3.6/site-packages/vulnbutler/nexpose_connection.py", line 30, in export_scan
    scan_id, format='csv', template_id=template_id,
  File "/usr/local/lib/python3.6/site-packages/nexpose/nexpose.py", line 2052, in GenerateScanReport
    data = self.RequestReportAdhocGenerate(scan_or_id, format, template_id)
  File "/usr/local/lib/python3.6/site-packages/nexpose/nexpose.py", line 638, in RequestReportAdhocGenerate
    return self.ExecuteBasicWithElement("ReportAdhocGenerateRequest", {}, as_xml(request_data))
  File "/usr/local/lib/python3.6/site-packages/nexpose/nexpose.py", line 334, in ExecuteBasicWithElement
    return self._Execute_APIv1d1(request)
  File "/usr/local/lib/python3.6/site-packages/nexpose/nexpose.py", line 308, in _Execute_APIv1d1
    raise NexposeConnectionException("Unable to execute the request: {0}!".format(ex), ex)
nexpose.nexpose.NexposeConnectionException: Unable to execute the request: xmlSAX2Characters: huge text node, line 129881, column 54 (<string>, line 129881)!

Possible Solution

I haven't looked at the format of the response yet, but I assume the CSV is wrapped in XML, which means there is a huge text node inside. Possibly the response could be streamed, or the parser's limit could be raised (or made configurable). Raising the limit will still fail once a report grows past the new limit, though; only a solution such as streaming the text would be a proper fix.
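
A minimal sketch of the "raise the limit" option, assuming the response really is one XML document with the CSV in a single text node. lxml's `XMLParser` accepts a `huge_tree` flag that lifts libxml2's default size restrictions. The helper name `parse_huge_xml` and the `<Response><Data>` layout are hypothetical and not part of nexpose-client-python; this is not necessarily how the library should fix it.

```python
from lxml import etree


def parse_huge_xml(data):
    """Parse XML bytes that may contain very large text nodes.

    Hypothetical helper for illustration only.
    """
    # huge_tree=True disables libxml2's built-in size restrictions,
    # so it should only be used on responses from a trusted server.
    parser = etree.XMLParser(huge_tree=True)
    return etree.XML(data, parser)


if __name__ == "__main__":
    # Made-up payload with a text node of roughly 12 MB.
    payload = b"<Response><Data>" + b"a," * 6_000_000 + b"</Data></Response>"
    root = parse_huge_xml(payload)
    print(root.tag, len(root[0].text))
```

This still holds the whole report in memory, so it only moves the ceiling; a streaming approach would be needed to remove it.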

Steps to Reproduce (for bugs)

Context

This came up while trying to export scan results as CSV. CSV export was introduced (by me) in #43; e80bd81 specifically contains the changes. This bug prevents larger scans from being exported as CSV.