geopython / OWSLib

OWSLib is a Python package for client programming with Open Geospatial Consortium (OGC) web service (hence OWS) interface standards, and their related content models.
https://owslib.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Report MapServer HTML errors in WMS Reader #826

Open geographika opened 2 years ago

geographika commented 2 years ago

When connecting to MapServer to get server capabilities, any resulting errors are reported in HTML.

For example:

from owslib.wms import WebMapService

wms = WebMapService(url, version=version)

This can raise the following cryptic error:

  File "venv\lib\site-packages\owslib\wms.py", line 54, in WebMapService
    return wms130.WebMapService_1_3_0(
  File "venv\lib\site-packages\owslib\map\wms130.py", line 97, in __init__
    self._buildMetadata(parse_remote_metadata)
  File "venv\lib\site-packages\owslib\map\wms130.py", line 107, in _buildMetadata
    self.identification = ServiceIdentification(serviceelem, self.version)
  File "venv\lib\site-packages\owslib\map\wms130.py", line 407, in __init__
    self.type = testXMLValue(self._root.find(nspath('Name', WMS_NAMESPACE)))
AttributeError: 'NoneType' object has no attribute 'find'

Once the capabilities response has been returned in the following code, we could check whether it is HTML or XML and, if it is HTML, raise an exception that reports the server's error:

https://github.com/geopython/OWSLib/blob/a6105226f1c8466c3b1e9998747d0dbf01434129/owslib/map/wms130.py#L77-L81

print(type(self._capabilities))
# If the request works correctly, this is:
# <Element '{http://www.opengis.net/wms}WMS_Capabilities' at 0x00000221FF4692B0>
# If the server returned an HTML error, this is:
# <class 'xml.etree.ElementTree.Element'>
# The raw HTML can be retrieved with:
print(ElementTree.tostring(self._capabilities, encoding='utf8', method='xml'))

However, maybe it would be better to handle this in WMSCapabilitiesReader directly?
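To illustrate the idea of checking the parsed document, here is a minimal sketch that inspects the root element's tag and raises if it is not a capabilities document. The helper names (`local_name`, `check_capabilities_root`), the `EXPECTED_ROOTS` set, and the sample documents are assumptions made for illustration; none of this is existing OWSLib code.

```python
import xml.etree.ElementTree as ET

# Root element local names of WMS 1.3.0 and 1.1.1 capabilities documents.
EXPECTED_ROOTS = {"WMS_Capabilities", "WMT_MS_Capabilities"}


def local_name(tag):
    """Strip a leading '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def check_capabilities_root(root):
    """Hypothetical helper: raise ValueError unless root is a capabilities document."""
    name = local_name(root.tag)
    if name not in EXPECTED_ROOTS:
        # Include the raw response so the server's message is visible.
        raw = ET.tostring(root, encoding="unicode")
        raise ValueError(
            f"Expected a WMS capabilities document, got <{name}>: {raw}"
        )
    return root


# Made-up sample documents standing in for real server responses.
good = ET.fromstring(
    '<WMS_Capabilities xmlns="http://www.opengis.net/wms" version="1.3.0"/>'
)
bad = ET.fromstring("<html><body>Example error message</body></html>")

check_capabilities_root(good)  # passes silently
try:
    check_capabilities_root(bad)
except ValueError as exc:
    print(exc)
```

A check like this only helps when the HTML error page happens to be well-formed enough to parse as XML, which is the case described above.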

geographika commented 2 years ago

Looks like the content type can be checked here:

https://github.com/geopython/OWSLib/blob/a6105226f1c8466c3b1e9998747d0dbf01434129/owslib/map/common.py#L65-L68

        print(u.info()['Content-Type'])
        # Good: text/xml; charset=UTF-8
        # Bad: text/html

If this should always return XML, then perhaps an error that includes the raw server response could be raised here whenever any other content type is returned?
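A rough sketch of such a content-type check follows. It assumes the header value and the response body are already available as a string and bytes. The function `ensure_xml_response` and the exception `UnexpectedContentType` are hypothetical names, not part of OWSLib, and the "contains xml" test is a loose heuristic chosen so that types such as `application/vnd.ogc.wms_xml` also pass.

```python
class UnexpectedContentType(Exception):
    """Hypothetical exception for a capabilities response that is not XML."""


def ensure_xml_response(content_type, body):
    """Return body unchanged if content_type looks like XML, otherwise raise.

    content_type: value of the Content-Type header (may be None).
    body: raw response bytes.
    """
    # Drop parameters such as '; charset=UTF-8' and normalise case.
    media_type = (content_type or "").split(";", 1)[0].strip().lower()
    if "xml" not in media_type:
        text = body.decode("utf-8", errors="replace")
        raise UnexpectedContentType(
            f"Expected XML but the server returned "
            f"{media_type or 'no content type'}: {text}"
        )
    return body


# Made-up responses for illustration.
ensure_xml_response("text/xml; charset=UTF-8", b"<WMS_Capabilities/>")
try:
    ensure_xml_response("text/html", b"<html>Example error message</html>")
except UnexpectedContentType as exc:
    print(exc)
```

Raising at this point would surface the server's own message instead of the later `AttributeError`, though whether every WMS server sets an XML content type on success would need checking first.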