geopython / GeoHealthCheck

Service Status and QoS Checker for OGC Web Services
https://geohealthcheck.org
MIT License

lxml.etree.XMLSyntaxError: internal error: Huge input lookup #361

Closed bart-v closed 3 years ago

bart-v commented 3 years ago

The check plugin script fails with the error "internal error: Huge input lookup" when the in-memory size of an XML file exceeds 10 MB.

Steps to reproduce the behavior, e.g.:

Add this WMS https://ec.oceanbrowser.net/emodnet/Python/web/wms?service=WMS&version=1.3.0&request=GetCapabilities

Expected Behavior: No error

Screenshots or Logfiles

Message: (<class 'lxml.etree.XMLSyntaxError'>, XMLSyntaxError('internal error: Huge input lookup, line 161640, column 353'), <traceback object at 0x7f074f6aa690>))

Additional context: Exactly the same issue is reported here: https://github.com/KimiNewt/pyshark/issues/75

justb4 commented 3 years ago

Hm, in PR #362 the failing unit tests (which somehow do not fail when running with Docker) are due to the error below. At first I thought the tests were failing due to today's PDOK problems.

2021-04-30 12:59:01,424 - GeoHealthCheck.probe - INFO - Check: fun=GeoHealthCheck.plugins.check.checks.XmlParse result=False
2021-04-30 12:59:01,424 - GeoHealthCheck.probe - INFO - Result: success=False msg=(<class 'TypeError'>, TypeError("'huge_tree' is an invalid keyword argument for XMLParser()"), <traceback object at 0x10aa6e408>) response_time=0.791806

GeoHealthCheck.plugins.check.checks.XmlParse gets etree not from lxml directly but via: from owslib.etree import etree. Somehow lxml is only installed in Docker, so outside Docker the parser evidently was not lxml's and rejected the huge_tree argument. I moved lxml==4.6.3 from the Docker-only requirements to the core requirements.txt. Now at least the Main CI Workflow succeeds, and I expect the Docker workflow to be OK as well.
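
For illustration, a minimal sketch of why the TypeError above can appear and one way to guard against it. It assumes that an import like owslib.etree falls back to the standard library's xml.etree.ElementTree when lxml is not installed. That is an assumption about OWSLib, not something confirmed in this thread. The names HAVE_LXML, make_parser, parse_xml and stdlib_rejects_huge_tree are hypothetical and not part of GeoHealthCheck.

```python
import xml.etree.ElementTree as stdlib_etree

# Assumed fallback pattern: prefer lxml, otherwise use the standard library.
try:
    from lxml import etree
    HAVE_LXML = True
except ImportError:
    etree = stdlib_etree
    HAVE_LXML = False


def make_parser():
    """Build an XML parser, enabling huge_tree only when lxml is the backend."""
    if HAVE_LXML:
        # huge_tree=True lifts libxml2's built-in size limits in lxml.
        return etree.XMLParser(huge_tree=True)
    # The standard-library XMLParser has no huge_tree option.
    return etree.XMLParser()


def parse_xml(data):
    """Parse XML bytes with whichever backend was imported above."""
    return etree.fromstring(data, parser=make_parser())


def stdlib_rejects_huge_tree():
    """Return True if the standard-library parser refuses huge_tree."""
    try:
        stdlib_etree.XMLParser(huge_tree=True)
    except TypeError:
        return True
    return False


if __name__ == "__main__":
    root = parse_xml(b"<a><b>x</b></a>")
    print(root.tag, HAVE_LXML, stdlib_rejects_huge_tree())
```

Pinning lxml in the core requirements, as done here, avoids the fallback altogether. The guard in make_parser is only an alternative for environments where lxml may be absent.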

justb4 commented 3 years ago

Issue solved, closing, re-open if necessary.