python-caldav / caldav

Apache License 2.0
324 stars 98 forks source link

`NotImplementedError: remove the double slashes` when getting events (UID with '//') #143

Closed Feandil closed 3 years ago

Feandil commented 3 years ago

Hi!

While trying to debug support within OX of events with UIDs containing a / (URLs), I triggered the following error (XXXX & XXX were replaced by me):

>>> cal.events()
DEBUG:caldav:sending request - method=REPORT, url=https://XXXX/dav/caldav/XXX/, headers={'User-Agent': 'Mozilla/5.0', 'Content-Type': 'application/xml; charset="utf-8"', 'Accept': 'text/xml, text/calendar', 'Depth': '1'}
body:
<?xml version='1.0' encoding='utf-8'?>
<C:calendar-query xmlns:D="DAV:" xmlns:C="urn:ietf:params:xml:ns:caldav"><D:prop><C:calendar-data/></D:prop><C:filter><C:comp-filter name="VCALENDAR"><C:comp-filter name="VEVENT"/></C:comp-filter></C:filter></C:calendar-query>
DEBUG:urllib3.connectionpool:https://XXX:443 "REPORT /dav/caldav/XXX/ HTTP/1.1" 207 None
DEBUG:caldav:response headers: {'Date': 'Tue, 13 Apr 2021 12:48:59 GMT', 'Server': 'Openexchange WebDAV', 'X-Robots-Tag': 'none', 'Content-Type': 'text/xml; charset=UTF-8', 'Transfer-Encoding': 'chunked'}
DEBUG:caldav:response status: 0
DEBUG:caldav:b'<D:multistatus xmlns:D="DAV:" xmlns:APPLE="http://apple.com/ns/ical/" xmlns:CAL="urn:ietf:params:xml:ns:caldav" xmlns:CS="http://calendarserver.org/ns/">\n  <D:response>\n    <D:href>/dav/caldav/XXX/https%3A%2F%2Fexample.com%2Fabcdef.ics</D:href>\n    <D:propstat>\n      <D:prop>\n        <calendar-data xmlns="urn:ietf:params:xml:ns:caldav">BEGIN:VCALENDAR\nVERSION:2.0\nPRODID:-//Open-Xchange//7.10.4-Rev20//EN\nBEGIN:VEVENT\nDTSTAMP:20210413T124744Z\nCLASS:PUBLIC\nCREATED:20210413T124744Z\nDTEND:20210427T103000Z\nDTSTART:20210427T063000Z\nLAST-MODIFIED:20210413T124744Z\nSEQUENCE:0\nSUMMARY:Example event\nTRANSP:OPAQUE\nUID:https://example.com/abcdef\nPRIORITY:5\nEND:VEVENT\nEND:VCALENDAR</calendar-data>\n      </D:prop>\n      <D:status>HTTP/1.1 200 OK</D:status>\n    </D:propstat>\n  </D:response>\n</D:multistatus>\n'
DEBUG:caldav:b'<?xml version="1.0" encoding="UTF-8"?>\r\n<D:multistatus xmlns:D="DAV:" xmlns:APPLE="http://apple.com/ns/ical/" xmlns:CAL="urn:ietf:params:xml:ns:caldav" xmlns:CS="http://calendarserver.org/ns/">\r\n  <D:response>\r\n    <D:href>/dav/caldav/XXX/https%3A%2F%2Fexample.com%2Fabcdef.ics</D:href>\r\n    <D:propstat>\r\n      <D:prop>\r\n        <calendar-data xmlns="urn:ietf:params:xml:ns:caldav">BEGIN:VCALENDAR\r\nVERSION:2.0\r\nPRODID:-//Open-Xchange//7.10.4-Rev20//EN\r\nBEGIN:VEVENT\r\nDTSTAMP:20210413T124744Z\r\nCLASS:PUBLIC\r\nCREATED:20210413T124744Z\r\nDTEND:20210427T103000Z\r\nDTSTART:20210427T063000Z\r\nLAST-MODIFIED:20210413T124744Z\r\nSEQUENCE:0\r\nSUMMARY:Example event\r\nTRANSP:OPAQUE\r\nUID:https://example.com/abcdef\r\nPRIORITY:5\r\nEND:VEVENT\r\nEND:VCALENDAR</calendar-data>\r\n      </D:prop>\r\n      <D:status>HTTP/1.1 200 OK</D:status>\r\n    </D:propstat>\r\n  </D:response>\r\n</D:multistatus>\r\n'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.9/site-packages/caldav/objects.py", line 995, in events
    return self.search(root, comp_class=Event)
  File "/usr/lib/python3.9/site-packages/caldav/objects.py", line 757, in search
    (response, objects) = self._request_report_build_resultlist(xml, comp_class)
  File "/usr/lib/python3.9/site-packages/caldav/objects.py", line 742, in _request_report_build_resultlist
    if self.url.join(url) == self.url:
  File "/usr/lib/python3.9/site-packages/caldav/lib/url.py", line 71, in __eq__
    me = self.canonical()
  File "/usr/lib/python3.9/site-packages/caldav/lib/url.py", line 142, in canonical
    raise NotImplementedError("remove the double slashes")
NotImplementedError: remove the double slashes

The steps to reproduce are non trivial:

According to what I can read in rfc5545, UIDs are "text" for which // seems to be legit, but caldav/lib/url.py doesn't seem to like that...

(side note: amusingly, trying to save this eveng with caldav triggers a PUT on /dav/caldav/XXX/https%3A//example.com/abcdef.ics which itself triggers a 404 on our local OX installation, so support for // in UID doesn't seem great in general...)

tobixen commented 3 years ago

ouch, this is indeed a bug.

tobixen commented 3 years ago

I'm not able to think very straight those days due to a confirmed case of covid ... but I think that throughout the caldav library, the default URL for a calendar event is {url_to_calendar}/{uid}.ics - which of course breaks when there are slashes in the uid. If I've understood the RFCs correct, it's totally up to the client what filename to give when saving a calendar event. So the solution may be to find the place in the code where the URL is made up, and run some algorithm to make the uuid into some legit unique file name. Or perhaps use some uuid generated by the library as the file name, if the given uuid contains invalid characters. What do you think?

Feandil commented 3 years ago

Ow, I'm sorry to learn about your health issues. Take care of yourself! I don't think that this bug is urgent (and with an obvious work around: not using URLs as UIDs... ;) )

After digging around (looking e.g. at the davx5 implementation & the library used there), I think I see two bugs, but I'm not 100% sure...

  1. When saving a new calendar, {url_to_calendar}/{uid}.ics fails if {uid} is weird enough, esp. if it contains / (I wonder what other cases could trigger interesting behaviour (e.g. uid set to @example.com/avc?). As you said, I think (with my limited knowledge and experience of CalDAV) there are two options here, both relying on validating the UUID and if it's weird enough, pick one of:
    • Escape somehow the dangerous characters
    • Use a deterministic hash (or equivalent) of the UID (risk of collision limited if long enough)
  2. When trying to read a saved calendar (e.g. pushed from other clients) that contain URL-encoded uids, the library decode and interpret that part. I think it should just be left as-is (but I don't know if it can break other implementations)
mhthies commented 3 years ago

I have a similar problem while trying to sync existing iCal data to a Nextcloud CalDAV server:

Events with any number of slashes in their UID result in an HTTP 404 error due to the additional slashes in the URL of the PUT request.

After digging around in the code of python-caldav, I found out the urllib's quote() is already being used for creating the PUT URL. However, it's used with its default parameters, i.e. safe="/", i.e. slashes are not escaped. I experimentally changed that to safe="" and ran into new problems, since my Apache Webserver (or something of the mod_proxy or mod_proxy_fcgi), working as a Reverse-Proxy in front of my Nextcloud server, obviously does not like single-encoded slashes in the request URL. It returned an HTTP 404 before even passing the request to Nextcloud.

I ended up doing double-URL-escaping slashes in the UID ('/' ⇒ %252F) and normal URL-escaping for all the other reserved characters, by adding a simple .replace('/', '%2F') to the CalendarObjectResource._create() method:

        if path is None:
            path = quote(id.replace('/', '%2F')) + ".ics"

This solves the issue in my case.

tobixen commented 3 years ago

I don't see any issues with double-quoting slashes there.

Still, once again I seem unable to wrap my head completely around this issue ... I feel a bit uncertain if the fix I just pushed is a complete fix that solves all kind of weird characters in the UID (another rare corner case that may need handling: extremely long UIDs ... and perhaps most important, cases where the quoted UID matches some existing file name on the server - that could potentially be a security flaw), or if it's just a partial fix that solves issues with slashes in the UID.

mhthies commented 3 years ago

I don't see any issues with double-quoting slashes there.

I must admit, that I didn't read into the CalDAV standards yet, with respect to the actual meaning of the URLs and the requirements imposed on them.

If they need to be collision-free between different calendar objects (i.e. different UIDs) the double-quoting of (only) the slashes (as I demonstrated above), will violate that in certain cases: It results in the (probably rare) collission between a UID with an already quoted slash and a similar UID with a literal slash in the same location.

tobixen commented 3 years ago

I think I will just call it a day on this issue and close it, even if it possibly ought to be researched more.