plone / Products.CMFPlone

The core of the Plone content management system
https://plone.org
GNU General Public License v2.0

Plone 6: Virtual Host Monster-refereed URLs fail when WebDAV requests are made to the server #3386

Open Rudd-O opened 2 years ago

Rudd-O commented 2 years ago

TL;DR: WebDAV is broken in Plone 6 when VirtualHostMonster is used. It worked fine in Plone 5.

Context: I need VHM to work because I need clients to have DAV file-management access exclusively to their own sites, and nothing more, via their domain names and their Plone usernames/passwords.

In Plone 5.2 / Zope 4.5.3, a VHM-routed HTTP request to a Plone site, OPTIONS https://site.com/VirtualHostBase/https/site.com/Plone/VirtualHostRoot/ HTTP/1.0, succeeds without issue, and DAV clients can connect fine. The same is true for PROPFIND. PUT works as expected.

In Plone 6.0 alpha / Zope 5.3, the exact same HTTP request through VHM returns 404. The same is true for PROPFIND. PUT does not allow uploads. GET works because it's special-cased (the code tacks manage_DAVget onto the end of the URL).

As a consequence of this malfunction, DAV clients cannot connect to Plone 6 sites.

Both tests were performed through the configured webdav-source-port, on plain vanilla Plone setups (Zope 4 = Plone 5.2, Zope 5 = Plone 6).

Importantly:

More details:

With a Plone site at /Site and a Zope folder at /folder (both in the Zope root):

Something must be wrong with traversal?

Log of tests via VHM:

# This is the Zope 4 server.
*   Trying 127.0.1.3:8081...
* Connected to 127.0.1.3 (127.0.1.3) port 8081 (#0)
> PROPFIND /VirtualHostBase/https/site.com/Plone/VirtualHostRoot/ HTTP/1.1
> Host: 127.0.1.3:8081
> User-Agent: curl/7.76.1
> Accept: */*
> Authorization: Basic <censored>
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 207 Multi-Status
< Content-Length: 3663
< Content-Location: https://site.com:8081/
< Content-Type: text/xml; charset="utf-8"
< Date: Tue, 21 Dec 2021 18:44:38 GMT
< Date: Tue, 21 Dec 2021 18:44:38 GMT
< Server: waitress
< Via: waitress
< X-Cache-Rule: plone.content.folderView
< X-Powered-By: Zope (www.zope.org), Python (www.python.org)
< 
{ [3663 bytes data]
* Connection #0 to host 127.0.1.3 left intact

# This is the Zope 5 server.
*   Trying 127.0.1.4:8081...
* Connected to 127.0.1.4 (127.0.1.4) port 8081 (#0)
> PROPFIND /VirtualHostBase/https/site.com/Plone/VirtualHostRoot/ HTTP/1.1
> Host: 127.0.1.4:8081
> User-Agent: curl/7.76.1
> Accept: */*
> Authorization: Basic <censored>
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 404 Not Found
< Content-Length: 26
< Content-Type: application/json
< Date: Tue, 21 Dec 2021 18:44:38 GMT
< Server: waitress
< Via: waitress
< X-Powered-By: Zope (www.zope.org), Python (www.python.org)
< 
{ [26 bytes data]
* Connection #0 to host 127.0.1.4 left intact
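The two curl runs above can be repeated from Python. This is a minimal sketch using only the standard library; the helper names `vhm_path` and `dav_status` are made up for illustration, and the addresses and port are the ones from the log.

```python
import http.client


def vhm_path(scheme, host, site_path, subpath=""):
    """Build a VirtualHostMonster-style request path (hypothetical helper)."""
    site = site_path.strip("/")
    rest = subpath.lstrip("/")
    return f"/VirtualHostBase/{scheme}/{host}/{site}/VirtualHostRoot/{rest}"


def dav_status(server, port, method, path, auth_header=None):
    """Send one request to the WebDAV source port and return the HTTP status."""
    headers = {"Accept": "*/*"}
    if auth_header:
        headers["Authorization"] = auth_header
    conn = http.client.HTTPConnection(server, port, timeout=10)
    try:
        # http.client passes the method string through, so PROPFIND works too.
        conn.request(method, path, headers=headers)
        return conn.getresponse().status
    finally:
        conn.close()
```

Calling `dav_status("127.0.1.3", 8081, "PROPFIND", vhm_path("https", "site.com", "/Plone"))` with a suitable `Authorization` value should reproduce the 207 seen on the Zope 4 server, and the same call against 127.0.1.4 should reproduce the 404, assuming both servers are set up as described.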

Originally opened as https://github.com/zopefoundation/Zope/issues/996

Rudd-O commented 2 years ago

I've done some more printf() debugging.

In both VHM and non-VHM cases, PloneSite.__before_publishing_traverse__() is being called, again in both cases with identical parameters.

However:

Something is eating the VHM request before it gets to `listDAVObjects()`.

Rudd-O commented 2 years ago

Here is a PROPFIND request without VHM:

# WSGIPublisher().publish() right before calling obj = request.traverse(...)
going to traverse traverse {'debug_mode': True,
 'module_info': (<Application at >, 'Intranet', True),
 'obj': <Application at >,
 'path': '/site.com/',
 'pprint': <module 'pprint' from '/home/user/optpython/3.8.6/lib/python3.8/pprint.py'>,
 'realm': 'Intranet',
 'request': <WSGIRequest, URL=http://127.0.1.4:8081>,
 'response': WSGIResponse(b''),
 'sys': <module 'sys' (built-in)>}

Same request, going through VHM:

# WSGIPublisher().publish() right before calling obj = request.traverse(...)
going to traverse traverse {'debug_mode': True,
 'module_info': (<Application at >, 'Intranet', True),
 'obj': <Application at >,
 'path': '/VirtualHostBase/https/site.com/site.com/VirtualHostRoot/',
 'pprint': <module 'pprint' from '/home/user/optpython/3.8.6/lib/python3.8/pprint.py'>,
 'realm': 'Intranet',
 'request': <WSGIRequest, URL=http://127.0.1.4:8081>,
 'response': WSGIResponse(b''),
 'sys': <module 'sys' (built-in)>}

I think path is supposed to have the whole path, but isn't VHM supposed to have kicked in at this point in the stack?
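For reference, here is how the VHM path in the second dump breaks down. This is only a sketch of the URL convention, not Zope's actual VirtualHostMonster code; `split_vhm_path` is a hypothetical helper, and it ignores the optional `_vh_` segments that may follow VirtualHostRoot.

```python
def split_vhm_path(path):
    """Split a VHM-style path into (scheme, host, physical_path, virtual_path).

    Returns None when the path does not follow the
    /VirtualHostBase/<scheme>/<host>/.../VirtualHostRoot/... convention.
    """
    parts = [p for p in path.split("/") if p]
    if len(parts) < 3 or parts[0] != "VirtualHostBase":
        return None
    scheme, host = parts[1], parts[2]
    rest = parts[3:]
    if "VirtualHostRoot" not in rest:
        return None
    idx = rest.index("VirtualHostRoot")
    physical = "/" + "/".join(rest[:idx])      # where the site lives in Zope
    virtual = "/" + "/".join(rest[idx + 1:])   # what the client asked for
    return scheme, host, physical, virtual
```

Under this convention the VHM path from the dump maps to the physical path `/site.com` with virtual path `/`, which is the same object the non-VHM request (`/site.com/`) addresses.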

Rudd-O commented 2 years ago

HTTP PUT is also broken with VHM. Without VHM, PUT returns Created or No Content. With VHM it returns 409 Conflict.

Traceback (innermost last):
  Module ZPublisher.WSGIPublisher, line 167, in transaction_pubevents
  Module ZPublisher.WSGIPublisher, line 376, in publish_module
  Module ZPublisher.WSGIPublisher, line 255, in publish
  Module ZPublisher.BaseRequest, line 522, in traverse
  Module ZPublisher.BaseRequest, line 348, in traverseName
  Module ZPublisher.BaseRequest, line 82, in publishTraverse
  Module webdav.NullResource, line 86, in __bobo_traverse__
webdav.common.Conflict: Collection ancestors must already exist.

I am now convinced that there's a bad traversal error when VHM is enabled.

Rudd-O commented 2 years ago

Yep, I have identified what the problem is.

Zope's BaseRequest.py attempts to look up virtual_hosting in traverseName().

In Plone 5 / Zope 4, there is no IPublishTraverse adapter associated with the PloneSite passed to traverseName(PloneSite, "virtual_hosting"). The default publisher traverse kicks in, and that somehow correctly looks up the requested object.

In Plone 6 / Zope 5, PloneSite has a plone.dexterity.browser.traversal.DexterityPublishTraverse associated with it. This adapter, sadly, does not know how to look up virtual_hosting — it actually attempts to look up an object called virtual_hosting inside the Plone site object. This obviously fails.

So there's the issue — a new registration for the DexterityPublishTraverse.

This does not show up with the GET method in the WebDAV mode, because manage_DAVget is used as a special case in this scenario.

Rudd-O commented 2 years ago

Some printf debugging confirms DexterityPublishTraverse does not appear to be called in Plone 5 to look up the object virtual_hosting inside the Plone site, while it is called in Plone 6. The following log shows the printf() output in Plone 6.

traverseName /home/user/optplone/buildout-cache/eggs/Zope-5.3-py3.8.egg/ZPublisher/BaseRequest.py <Application at > manuelamador.name
querying adapter provided?
adapter <Products.CMFPlone.browser.admin.AppTraverser object at 0x78222cc4a5e0>
publish traverse
returning from traverseName /home/user/optplone/buildout-cache/eggs/Zope-5.3-py3.8.egg/ZPublisher/BaseRequest.py <PloneSite at manuelamador.name>
traverseName /home/user/optplone/buildout-cache/eggs/Zope-5.3-py3.8.egg/ZPublisher/BaseRequest.py <PloneSite at manuelamador.name> virtual_hosting
querying adapter provided?
adapter <plone.dexterity.browser.traversal.DexterityPublishTraverse object at 0x78222cc4a550>
publish traverse
DexterityPublishTraverse being called for virtual_hosting
returning from traverseName /home/user/optplone/buildout-cache/eggs/Zope-5.3-py3.8.egg/ZPublisher/BaseRequest.py <webdav.NullResource.NullResource object at 0x78222ff8f040>
traverseName /home/user/optplone/buildout-cache/eggs/Zope-5.3-py3.8.egg/ZPublisher/BaseRequest.py <webdav.NullResource.NullResource object at 0x78222ff8f040> /
querying adapter provided?
adapter None
adapter is none
publish traverse

My guess is that when the Plone Site itself became a Dexterity content object, this kicked in.

Either:

I wish I knew how to do either of these properly!

david-batranu commented 1 year ago

Ran into this on Plone 6.0.0.2 while doing a HEAD request (curl -I).

2023-01-23 13:31:40,434 ERROR   [Zope.SiteErrorLog:35][waitress-1] NotFound: https://plone6.localsite.org/Plone/virtual_hosting//
Traceback (innermost last):
  Module ZPublisher.WSGIPublisher, line 172, in transaction_pubevents
  Module ZPublisher.WSGIPublisher, line 381, in publish_module
  Module ZPublisher.WSGIPublisher, line 260, in publish
  Module ZPublisher.BaseRequest, line 515, in traverse
  Module ZPublisher.BaseRequest, line 348, in traverseName
  Module ZPublisher.BaseRequest, line 105, in publishTraverse
  Module ZPublisher.BaseRequest, line 82, in publishTraverse
  Module webdav.NullResource, line 87, in __bobo_traverse__
zExceptions.NotFound: The requested resource was not found.

Apache VH:

RewriteRule ^/(.*) http://localhost:8080/VirtualHostBase/https/%{HTTP_HOST}/Plone/VirtualHostRoot/$1 [P,L]

rber474 commented 11 months ago

Found this bug in Plone 6.0.7, using VHM, for an HTTP PUT request, both for a BrowserView and for a plone.rest.Service. As @Rudd-O guessed, the plone.dexterity.browser.traversal.DexterityPublishTraverse adapter is called. Because the request is marked as maybe_webdav_client, the publishTraverse method returns a NullResource.

As a provisional workaround, I have overridden the DexterityPublishTraverse adapter to include an extra conditional statement:

"virtual_hosting" not in request.URL

Modified method:

    def publishTraverse(self, request, name):
        context = aq_inner(self.context)

        # If we are trying to traverse to the folder "body" pseudo-object
        # returned by listDAVObjects(), return that immediately

        if (
            getattr(request, "maybe_webdav_client", False)
            and name == DAV_FOLDER_DATA_ID
        ):
            return FolderDataResource(DAV_FOLDER_DATA_ID, context).__of__(context)

        defaultTraversal = super().publishTraverse(request, name)

        # If this is a WebDAV PUT/PROPFIND/PROPPATCH request, don't acquire
        # things. If we did, we couldn't create a new object with PUT, for
        # example, because the acquired object would shadow the NullResource

        if (
            getattr(request, "maybe_webdav_client", False)
            and request.get("REQUEST_METHOD", "GET")
            not in (
                "GET",
                "POST",
            )
            and IAcquirer.providedBy(defaultTraversal)
            # Workaround: leave VHM's virtual_hosting lookup to default traversal
            and 'virtual_hosting' not in request.URL
        ):
            parent = aq_parent(aq_inner(defaultTraversal))
            if parent is not None and parent is not context:
                return NullResource(self.context, name, request).__of__(self.context)

        return defaultTraversal
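To make the effect of the extra condition easier to see, here is a toy model in plain Python. It is a sketch under loud assumptions: `Node`, `acquire` and `publish_traverse` are invented stand-ins for Zope objects, acquisition and the adapter method, the `NullResource` class only mimics the role of webdav.NullResource, and the method and IAcquirer checks are collapsed into a single `webdav_write` flag.

```python
class Node:
    """Tiny stand-in for a Zope object that lives in a container."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = {}
        if parent is not None:
            parent.children[name] = self


class NullResource:
    """Stand-in for webdav.NullResource: a placeholder for a missing object."""

    def __init__(self, parent, name):
        self.parent = parent
        self.name = name


def acquire(context, name):
    """Toy acquisition: look in context first, then walk up its ancestors."""
    node = context
    while node is not None:
        if name in node.children:
            return node.children[name]
        node = node.parent
    raise KeyError(name)


def publish_traverse(context, name, url, webdav_write, guard):
    """Mimic the quoted method; guard toggles the virtual_hosting check."""
    found = acquire(context, name)
    if webdav_write and (not guard or "virtual_hosting" not in url):
        # Acquired from an ancestor rather than contained in context:
        # treat it as nonexistent so PUT could create it here.
        if found.parent is not context:
            return NullResource(context, name)
    return found


# virtual_hosting sits in the application root, next to the Plone site.
app = Node("app")
vhm = Node("virtual_hosting", app)
site = Node("Plone", app)
url = "https://plone6.localsite.org/Plone/virtual_hosting"

without_guard = publish_traverse(site, "virtual_hosting", url, True, guard=False)
with_guard = publish_traverse(site, "virtual_hosting", url, True, guard=True)
```

In this model, `without_guard` is a `NullResource` because `virtual_hosting` is only reachable through the parent, while `with_guard` is the real `vhm` object. Whether matching on request.URL is the right long-term fix is a separate question; a substring test would also skip the NullResource path for any content URL that happens to contain "virtual_hosting".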

Maybe someone can take a glance at this because I don't feel comfortable with Publisher and Traversals 😨

Rudd-O commented 11 months ago

Man, it's been two years. I hope this gets fixed some day. VHM is core functionality; it should not be breaking DAV.