When fetching large results with gmp.get_reports and shell_mode=True, we get a "huge text node" error, for example:
Error: xmlSAX2Characters: huge text node, line 12679, column 561 (, line 12679)
It seems lxml rejects very large text nodes unless the parser is created with the huge_tree option. From the lxml documentation (lxml.de/parsing.html#parsers):
huge_tree - disable security restrictions and support very deep trees and very long text content (only affects libxml2 2.7+)
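To illustrate the option on its own, here is a minimal sketch that parses a document containing one very large text node with huge_tree=True. The helper name parse_report and the synthetic 11 MB payload are my own stand-ins, not part of gvm-tools. Since the option disables libxml2's security limits, it is probably best reserved for input from a trusted source.

```python
import io
from lxml import etree


def parse_report(source, huge=True):
    """Parse XML from a path or file-like object.

    huge=True lifts libxml2's limits on text size and tree depth.
    (Hypothetical helper, for illustration only.)
    """
    parser = etree.XMLParser(encoding='utf-8', huge_tree=huge)
    return etree.parse(source, parser)


# Synthetic stand-in for a large report: a single text node of about 11 MB.
big_xml = b'<report>' + b'x' * (11 * 1024 * 1024) + b'</report>'

tree = parse_report(io.BytesIO(big_xml))
print(len(tree.getroot().text))
```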
This is the diff for my ugly hack that handles "huge text nodes":
(ovas-mgr) falk@broekn ~/_tmp » diff gvm_connection.py-orig gvm_connection.py
39a40,42
> parser = etree.XMLParser(encoding='utf-8', recover=True, huge_tree=False)
> huge_parser = etree.XMLParser(encoding='utf-8', recover=True, huge_tree=True)
>
108c111,117
< tree = etree.parse(f)
---
> try:
>     tree = etree.parse(f, parser)
> except Exception as err:
>     if 'huge text node' in err.msg:
>         tree = etree.parse(f, huge_parser)
>     else:
>         raise err
135c144
< parser = etree.XMLParser(encoding='utf-8', recover=True)
---
>
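A somewhat more defensive variant of the same idea, as a sketch only. The diff above catches every Exception and reads err.msg, which not every exception type has, and it matches on the message text, which may differ between libxml2 versions. It also reuses f for the second attempt; if f is a file object rather than a path, it would have to be rewound first. The version below assumes the response is already available as bytes, catches only etree.XMLSyntaxError, and simply retries with huge_tree=True. A document that is really malformed fails the second attempt too, at the cost of being parsed twice. The function name parse_with_fallback is hypothetical.

```python
import io
from lxml import etree


def parse_with_fallback(data):
    """Parse XML bytes with default limits, retrying once with huge_tree=True.

    A fresh BytesIO is created for each attempt, so the retry always starts
    at the beginning of the data. (Hypothetical helper, not gvm-tools API.)
    """
    try:
        parser = etree.XMLParser(encoding='utf-8')
        return etree.parse(io.BytesIO(data), parser)
    except etree.XMLSyntaxError:
        # Retry without libxml2's size limits; if the document is truly
        # malformed, this second parse raises XMLSyntaxError again.
        huge_parser = etree.XMLParser(encoding='utf-8', huge_tree=True)
        return etree.parse(io.BytesIO(data), huge_parser)


tree = parse_with_fallback(b'<report>ok</report>')
print(tree.getroot().text)
```

Small, well-formed responses never touch the huge parser, so the relaxed limits only apply when the default parser has already given up.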
Code to reproduce:
#!/usr/bin/python3
from gmp.gvm_connection import TLSConnection
from config import GVM_HOSTNAME, GVM_PORT, GVM_TIMEOUT, GVM_USER, GVM_PASSWD

# gmp has to be global, so the load-function has the correct namespace
gmp = None

# Huge report in openvas (~14M)
rid = '5b5a5053-da06-4ce7-a2e2-39150f16eb53'


def connect(rid):
    global gmp
    gmp = TLSConnection(hostname=GVM_HOSTNAME, port=GVM_PORT,
                        timeout=GVM_TIMEOUT, shell_mode=True)
    gmp.authenticate(GVM_USER, GVM_PASSWD)


def get_report(rid):
    try:
        report = gmp.get_reports(report_id=rid)
        r = report
    except Exception as e:
        print('Error: ' + str(e))
        r = None
    return r


if __name__ == '__main__':
    connect(rid)
    r = get_report(rid)
    print(r)
As you can see from my code, I'm no Python coder. I don't know if this is of any interest, but perhaps it can help someone :)
-- Regards Falk