mar10 / wsgidav

A generic and extendable WebDAV server based on WSGI
https://wsgidav.readthedocs.io
MIT License

Release 4.3.1 uses an etree.XML keyword not supported by lxml #318

Status: Closed (musicinmybrain closed this issue 7 months ago)

musicinmybrain commented 7 months ago

Describe the bug

https://github.com/mar10/wsgidav/commit/571c492817e04fc780ac6e802dd794bf80d430d8#diff-1adbd2c820c975a3a89a97ae9b14a0062372e7e5bd81a18a362e71df695bce6cL52 started passing forbid_entities=True to etree.XML(), but the lxml version of this function does not support that keyword.

As a result, starting with 4.3.1, ServerTest::testGetPut and ServerTest::testLocking in tests/test_scripted.py fail when lxml is installed.
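The mismatch described above can be sketched as follows. This is a minimal illustration, not the project's actual fix: the helper name `string_to_xml` mirrors the function in the traceback, but the body is hypothetical. It assumes that, with lxml, entity handling is configured on the parser (`XMLParser(resolve_entities=False)`) rather than through a keyword on `etree.XML()`. The stdlib fallback is shown only so the snippet runs without lxml and is not hardened; the `forbid_entities` keyword looks like the one accepted by defusedxml's parsing functions, which is an assumption about the non-lxml code path.

```python
import xml.etree.ElementTree as stdlib_etree

try:
    # Optional dependency: only used when it is installed.
    from lxml import etree as lxml_etree
except ImportError:
    lxml_etree = None


def string_to_xml(text):
    """Parse an XML string/bytes into an element (hypothetical sketch).

    lxml.etree.XML() has no forbid_entities keyword, so entity
    resolution is disabled on an explicit parser instead.
    """
    if lxml_etree is not None:
        parser = lxml_etree.XMLParser(resolve_entities=False)
        return lxml_etree.XML(text, parser=parser)
    # Fallback for this sketch only: plain stdlib parsing, no hardening.
    return stdlib_etree.XML(text)


owner = string_to_xml(
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b'<ns0:owner xmlns:ns0="DAV:"><ns0:href>test-bench</ns0:href></ns0:owner>'
)
print(owner.tag)  # {DAV:}owner
```

Passing `forbid_entities=True` directly to `lxml.etree.XML()` or to the stdlib `xml.etree.ElementTree.XML()` raises the same kind of `TypeError` as in the traceback below, which is why the keyword has to be limited to a backend that accepts it.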

To Reproduce

Steps to reproduce the behavior:

  1. Check out the git repository.
  2. Run tox -e py311 and observe that all tests pass (or are skipped).
  3. Edit tox.ini and add lxml to testenv.deps.
  4. Run tox -e py311 again and observe that tests/test_scripted.py::ServerTest::testGetPut fails (and ServerTest::testLocking would fail if the -x option to pytest did not halt test execution).

Expected behavior

All tests pass with and without lxml.

Screenshots, Log-Files, Stacktrace

================================================================================================= FAILURES ==================================================================================================
___________________________________________________________________________________________ ServerTest.testGetPut ___________________________________________________________________________________________

self = <tests.test_scripted.ServerTest testMethod=testGetPut>

    def testGetPut(self):
        """Read and write file contents."""
        client = self.client
        # Prepare file content
        data1 = b"this is a file\nwith two lines"
        data2 = b"this is another file\nwith three lines\nsee?"
        # Big file with 10 MB
        lines = []
        line = "." * (1000 - 6 - len("\n"))
        for i in range(10 * 1000):
            lines.append("%04i: %s\n" % (i, line))
        data3 = "".join(lines)
        data3 = util.to_bytes(data3)

        # Cleanup
        client.delete("/test/")
        client.mkcol("/test/")
        client.check_response(201)

        # PUT files
        client.put("/test/file1.txt", data1)
        client.check_response(201)
        client.put("/test/file2.txt", data2)
        client.check_response(201)
        client.put("/test/bigfile.txt", data3)
        client.check_response(201)

        body = client.get("/test/file1.txt")
        client.check_response(200)
        assert body == data1, "Put/Get produced different bytes"

        # PUT with overwrite must return 204 No Content, instead of 201 Created
        client.put("/test/file2.txt", data2)
        client.check_response(204)

        client.mkcol("/test/folder")
        client.check_response(201)

        # if a LOCK request is sent to an unmapped URL, we must create a
        # lock-null resource and return '201 Created', instead of '404 Not found'
>       locks = client.set_lock(  
            "/test/lock-0",
            owner="test-bench",   
            lock_type="write",
            lock_scope="exclusive",
            depth="infinity",
        )

tests/test_scripted.py:250:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <tests.davclient.DAVClient object at 0x7f2090dd4450>, path = '/test/lock-0', owner = 'test-bench', lock_type = 'write', lock_scope = 'exclusive', depth = 'infinity'
headers = {'Content-Type': 'application/xml; charset=utf-8', 'Depth': 'infinity', 'Timeout': 'Infinite, Second-4100000000'}

    def set_lock(
        self,
        path,
        owner,
        lock_type="write",
        lock_scope="exclusive",   
        depth=None,
        headers=None,
    ):
        """Set a lock on a dav resource"""
        root = ElementTree.Element("{DAV:}lockinfo")
        object_to_etree(
            root,
            {"locktype": lock_type, "lockscope": lock_scope, "owner": {"href": owner}},
            namespace="DAV:",
        )
        tree = ElementTree.ElementTree(root)

        # Add proper headers
        if headers is None:
            headers = {}
        if depth is not None:
            headers["Depth"] = depth
        headers["Content-Type"] = "application/xml; charset=utf-8"
        headers["Timeout"] = "Infinite, Second-4100000000"

        body = self._tree_to_binary_body(tree)

        self._request("LOCK", path, body=body, headers=headers)

>       locks = self.response.tree.findall(".//{DAV:}locktoken")
E       AttributeError: 'NoneType' object has no attribute 'findall'

tests/davclient.py:430: AttributeError
------------------------------------------------------------------------------------------- Captured stdout call --------------------------------------------------------------------------------------------
DAVClient: DELETE - /test/
DAVClient: Could not parse response XML: no element found: line 1, column 0

DAVClient: MKCOL - /test/
DAVClient: Could not parse response XML: mismatched tag: line 5, column 2
<!DOCTYPE HTML PUBLIC '-//W3C//DTD HTML 4.01//EN' 'http://www.w3.org/TR/html4/strict.dtd'>
<html><head>
  <meta http-equiv='Content-Type' content='text/html; charset=utf-8'>
  <title>201 Created</title>
</head><body>
  <h1>201 Created</h1>
  <p>201 Created</p>
<hr/>
<a href='https://github.com/mar10/wsgidav/'>WsgiDAV/4.3.2-a1</a> - 2024-03-26 18:20:46.498746
</body></html>
DAVClient: PUT - /test/file1.txt  
DAVClient: Could not parse response XML: mismatched tag: line 5, column 2
<!DOCTYPE HTML PUBLIC '-//W3C//DTD HTML 4.01//EN' 'http://www.w3.org/TR/html4/strict.dtd'>
<html><head>
  <meta http-equiv='Content-Type' content='text/html; charset=utf-8'>
  <title>201 Created</title>
</head><body>
  <h1>201 Created</h1>
  <p>201 Created</p>
<hr/>
<a href='https://github.com/mar10/wsgidav/'>WsgiDAV/4.3.2-a1</a> - 2024-03-26 18:20:46.503801
</body></html>
DAVClient: PUT - /test/file2.txt  
DAVClient: Could not parse response XML: mismatched tag: line 5, column 2
<!DOCTYPE HTML PUBLIC '-//W3C//DTD HTML 4.01//EN' 'http://www.w3.org/TR/html4/strict.dtd'>
<html><head>
  <meta http-equiv='Content-Type' content='text/html; charset=utf-8'>
  <title>201 Created</title>
</head><body>
  <h1>201 Created</h1>
  <p>201 Created</p>
<hr/>
<a href='https://github.com/mar10/wsgidav/'>WsgiDAV/4.3.2-a1</a> - 2024-03-26 18:20:46.508030
</body></html>
DAVClient: PUT - /test/bigfile.txt
DAVClient: Could not parse response XML: mismatched tag: line 5, column 2
<!DOCTYPE HTML PUBLIC '-//W3C//DTD HTML 4.01//EN' 'http://www.w3.org/TR/html4/strict.dtd'>
<html><head>
  <meta http-equiv='Content-Type' content='text/html; charset=utf-8'>
  <title>201 Created</title>
</head><body>
  <h1>201 Created</h1>
  <p>201 Created</p>
<hr/>
<a href='https://github.com/mar10/wsgidav/'>WsgiDAV/4.3.2-a1</a> - 2024-03-26 18:20:46.535498
</body></html>
DAVClient: GET - /test/file1.txt  
DAVClient: Could not parse response XML: syntax error: line 1, column 0
this is a file
with two lines
DAVClient: PUT - /test/file2.txt  
DAVClient: Could not parse response XML: no element found: line 1, column 0

DAVClient: MKCOL - /test/folder   
DAVClient: Could not parse response XML: mismatched tag: line 5, column 2
<!DOCTYPE HTML PUBLIC '-//W3C//DTD HTML 4.01//EN' 'http://www.w3.org/TR/html4/strict.dtd'>
<html><head>
  <meta http-equiv='Content-Type' content='text/html; charset=utf-8'>
  <title>201 Created</title>
</head><body>
  <h1>201 Created</h1>
  <p>201 Created</p>
<hr/>
<a href='https://github.com/mar10/wsgidav/'>WsgiDAV/4.3.2-a1</a> - 2024-03-26 18:20:46.557220
</body></html>
DAVClient: LOCK - /test/lock-0
18:20:46.563 - ERROR   : Error parsing XML string. If lxml is not available, and unicode is involved, then installing lxml _may_ solve this issue.
18:20:46.563 - ERROR   : XML source: b'<?xml version=\'1.0\' encoding=\'UTF-8\'?>\n<ns0:owner xmlns:ns0="DAV:"><ns0:href><ns0:test-bench/></ns0:href></ns0:owner>'
18:20:46.565 - ERROR   : Traceback (most recent call last):
  File "/home/ben/src/forks/wsgidav/wsgidav/error_printer.py", line 50, in __call__
    for v in app_iter:
  File "/home/ben/src/forks/wsgidav/wsgidav/request_resolver.py", line 224, in __call__
    for v in app_iter:
  File "/home/ben/src/forks/wsgidav/wsgidav/request_server.py", line 126, in __call__
    app_iter = provider.custom_request_handler(environ, start_response, method)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ben/src/forks/wsgidav/wsgidav/dav_provider.py", line 1622, in custom_request_handler
    return default_handler(environ, start_response)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ben/src/forks/wsgidav/wsgidav/request_server.py", line 1258, in do_LOCK
    lockdiscovery_el = res.get_property_value("{DAV:}lockdiscovery")
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ben/src/forks/wsgidav/wsgidav/dav_provider.py", line 672, in get_property_value
    ownerEL = xml_tools.string_to_xml(lock["owner"])
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ben/src/forks/wsgidav/wsgidav/xml_tools.py", line 47, in string_to_xml
    return etree.XML(text, forbid_entities=True)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "src/lxml/etree.pyx", line 3218, in lxml.etree.XML
TypeError: XML() got an unexpected keyword argument 'forbid_entities'

18:20:46.565 - ERROR   : Caught HTTPRequestException(HTTP_INTERNAL_ERROR)
Traceback (most recent call last):
  File "/home/ben/src/forks/wsgidav/wsgidav/error_printer.py", line 50, in __call__
    for v in app_iter:
  File "/home/ben/src/forks/wsgidav/wsgidav/request_resolver.py", line 224, in __call__
    for v in app_iter:
  File "/home/ben/src/forks/wsgidav/wsgidav/request_server.py", line 126, in __call__
    app_iter = provider.custom_request_handler(environ, start_response, method)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ben/src/forks/wsgidav/wsgidav/dav_provider.py", line 1622, in custom_request_handler
    return default_handler(environ, start_response)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ben/src/forks/wsgidav/wsgidav/request_server.py", line 1258, in do_LOCK
    lockdiscovery_el = res.get_property_value("{DAV:}lockdiscovery")
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ben/src/forks/wsgidav/wsgidav/dav_provider.py", line 672, in get_property_value
    ownerEL = xml_tools.string_to_xml(lock["owner"])
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ben/src/forks/wsgidav/wsgidav/xml_tools.py", line 47, in string_to_xml
    return etree.XML(text, forbid_entities=True)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "src/lxml/etree.pyx", line 3218, in lxml.etree.XML
TypeError: XML() got an unexpected keyword argument 'forbid_entities'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ben/src/forks/wsgidav/wsgidav/error_printer.py", line 83, in __call__
    raise as_DAVError(e)
wsgidav.dav_error.DAVError: 500   

18:20:46.565 - ERROR   : e.src_exception:
XML() got an unexpected keyword argument 'forbid_entities'
DAVClient: Could not parse response XML: mismatched tag: line 5, column 2
<!DOCTYPE HTML PUBLIC '-//W3C//DTD HTML 4.01//EN' 'http://www.w3.org/TR/html4/strict.dtd'>
<html><head>
  <meta http-equiv='Content-Type' content='text/html; charset=utf-8'>
  <title>500 Internal Server Error</title>
</head><body>
  <h1>500 Internal Server Error</h1>
  <p>500 Internal Server Error: An internal server error occurred
    Source exception: TypeError(&quot;XML() got an unexpected keyword argument &#x27;forbid_entities&#x27;&quot;)</p>
<hr/>
<a href='https://github.com/mar10/wsgidav/'>WsgiDAV/4.3.2-a1</a> - 2024-03-26 18:20:46.565971
</body></html>

Environment:

WsgiDAV/4.3.2-a1 Python/3.12.2(64 bit) Linux-6.7.9-200.fc39.x86_64-x86_64-with-glibc2.38
Python from: /home/ben/src/forks/wsgidav/_e/bin/python3

(I don’t believe anything about the environment other than the availability of lxml is relevant.)

Which WSGI server was used (cheroot, ext-wsgiutils, gevent, gunicorn, paste, uvicorn, wsgiref, ...)?

N/A, running wsgidav’s own tests.

Which WebDAV client was used (MS File Explorer, MS Office, macOS Finder, WinSCP, Windows, file mapping, ...)?

N/A, running wsgidav’s own tests.

Additional context

I now maintain a python-wsgidav package in Fedora and in EPEL9, which is how I discovered the problem.