jelmer / xandikos

A CalDAV/CardDAV server backed by Git
https://www.xandikos.org/
GNU General Public License v3.0
423 stars 42 forks source link

category search produces an internal server error ... #216

Closed tobixen closed 2 years ago

tobixen commented 2 years ago

Now I have a calendar with those three events:

BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Example Corp.//CalDAV Client//EN
BEGIN:VEVENT
SUMMARY:Our Blissful Anniversary
DTSTART;VALUE=DATE:19971102
DTSTAMP:19970901T130000Z
UID:19970901T130000Z-123403@example.com
RRULE:FREQ=YEARLY
CATEGORIES:ANNIVERSARY,PERSONAL,SPECIAL OCCASION
CLASS:CONFIDENTIAL
TRANSP:TRANSPARENT
END:VEVENT
END:VCALENDAR

BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Example Corp.//CalDAV Client//EN
BEGIN:VEVENT
SUMMARY:Bastille Day Party
DTSTART:20060714T170000Z
DTEND:20060715T040000Z
DTSTAMP:20060712T182145Z
UID:20010712T182145Z-123401@example.com
END:VEVENT
END:VCALENDAR

BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Example Corp.//CalDAV Client//EN
BEGIN:VEVENT
SUMMARY:Bastille Day Jitsi Party
DTSTART:20210714T170000Z
DTEND:20210715T040000Z
DTSTAMP:20210712T182145Z
UID:20080712T182145Z-123401@example.com
END:VEVENT
END:VCALENDAR

... and doing this report query:

<C:calendar-query xmlns:D="DAV:" xmlns:C="urn:ietf:params:xml:ns:caldav">
  <D:prop>
    <C:calendar-data/>
  </D:prop>
  <C:filter>
    <C:comp-filter name="VCALENDAR">
      <C:comp-filter name="VEVENT">
        <C:prop-filter name="CATEGORIES">
          <C:text-match collation="i;octet">PERSONAL</C:text-match>
        </C:prop-filter>
      </C:comp-filter>
    </C:comp-filter>
  </C:filter>
</C:calendar-query>

... produces this traceback:

Error handling request
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/aiohttp/web_protocol.py", line 422, in _handle_request
    resp = await self._request_handler(request)
  File "/usr/local/lib/python3.8/dist-packages/aiohttp/web_app.py", line 499, in _handle
    resp = await handler(request)
  File "/usr/local/lib/python3.8/dist-packages/xandikos-0.2.8-py3.8.egg/xandikos/web.py", line 1476, in xandikos_handler
    return await main_app.aiohttp_handler(request, options.route_prefix)
  File "/usr/local/lib/python3.8/dist-packages/xandikos-0.2.8-py3.8.egg/xandikos/webdav.py", line 2152, in aiohttp_handler
    response = await self._handle_request(request, environ)
  File "/usr/local/lib/python3.8/dist-packages/xandikos-0.2.8-py3.8.egg/xandikos/webdav.py", line 2108, in _handle_request
    return await do.handle(request, environ, self)
  File "/usr/local/lib/python3.8/dist-packages/xandikos-0.2.8-py3.8.egg/xandikos/webdav.py", line 1764, in handle
    return await reporter.report(
  File "/usr/local/lib/python3.8/dist-packages/xandikos-0.2.8-py3.8.egg/xandikos/webdav.py", line 353, in wrapper
    async for resp in req_fn(self, environ, *args, **kwargs):
  File "/usr/local/lib/python3.8/dist-packages/xandikos-0.2.8-py3.8.egg/xandikos/caldav.py", line 566, in report
    async for (href, resource) in webdav.traverse_resource(
  File "/usr/local/lib/python3.8/dist-packages/xandikos-0.2.8-py3.8.egg/xandikos/webdav.py", line 1205, in traverse_resource
    for (child_name, child_resource) in members_fn(resource):
  File "/usr/local/lib/python3.8/dist-packages/xandikos-0.2.8-py3.8.egg/xandikos/web.py", line 636, in calendar_query
    for (name, file, etag) in self.store.iter_with_filter(filter=filter):
  File "/usr/local/lib/python3.8/dist-packages/xandikos-0.2.8-py3.8.egg/xandikos/store/__init__.py", line 358, in _iter_with_filter_indexes
    if filter.check_from_indexes(name, file_values):
  File "/usr/local/lib/python3.8/dist-packages/xandikos-0.2.8-py3.8.egg/xandikos/icalendar.py", line 832, in check_from_indexes
    if not child_filter.match_indexes(indexes, self.tzify):
  File "/usr/local/lib/python3.8/dist-packages/xandikos-0.2.8-py3.8.egg/xandikos/icalendar.py", line 630, in match_indexes
    if not child.match_indexes(subindexes, tzify):
  File "/usr/local/lib/python3.8/dist-packages/xandikos-0.2.8-py3.8.egg/xandikos/icalendar.py", line 630, in match_indexes
    if not child.match_indexes(subindexes, tzify):
  File "/usr/local/lib/python3.8/dist-packages/xandikos-0.2.8-py3.8.egg/xandikos/icalendar.py", line 728, in match_indexes
    if not child.match_indexes(subindexes):
  File "/usr/local/lib/python3.8/dist-packages/xandikos-0.2.8-py3.8.egg/xandikos/icalendar.py", line 509, in match_indexes
    return any(
  File "/usr/local/lib/python3.8/dist-packages/xandikos-0.2.8-py3.8.egg/xandikos/icalendar.py", line 510, in <genexpr>
    self.match(self.type_fn.from_ical(k)) for k in indexes[None])
  File "/usr/local/lib/python3.8/dist-packages/icalendar/prop.py", line 283, in from_ical
    out = unescape_char(ical).split(',')
  File "/usr/local/lib/python3.8/dist-packages/icalendar/parser.py", line 34, in unescape_char
    assert isinstance(text, (compat.unicode_type, compat.bytes_type))
AssertionError
[18/Oct/2022:13:09:32 +0000] "REPORT /user/calendars/pythoncaldav-test/ HTTP/1.0" 500 231 "-" "Mozilla/5.0"

The weird thing here is that while I'm consistently able to reproduce this towards my existing calendar server running stand-alone on the latest HEAD of master (a43ba24a13ac5a8810cc13b031fe73d6505e1e95), I'm not able to reproduce it running the same test code but towards an "internal" xandikos server running the same revision.

I will do more research into this, though not sure if I will catch it today.

tobixen commented 2 years ago

After a restart of xandikos, the problem could not be reproduced anymore. Weird.

tobixen commented 2 years ago

So ... I restart xandikos, run the test, it passes, rerun the same test once more, and xandikos fails with a 500-error. This is consistent, same error at the third and fourth run. Restart xandikos ... and test passes again. Once. And fails at the second run.

I conclude that there is some state in xandikos here that should not be there.

jelmer commented 2 years ago

There is a cache which is used for performance improvements - it's probably related to that. It should be there but it looks like there's an issue around its deserilalization :)

tobixen commented 2 years ago

ed58e93b421b6aa3817211c7c9de49a9d1ca8c7b is the first bad commit

sorry. disregard that. That was one of the other issues messing up with my bisect

tobixen commented 2 years ago

873bd39354a7ecf4cad1660869df6f7a0995c5d9 is probably the first bad commit

Disregard that too. Different error traceback.

tobixen commented 2 years ago

9c13cfcb01683f126bc1760efc7781d9aaa0d723 is apparently the first commit giving me a traceback from the icalendar module.

tobixen commented 2 years ago

Here is the minimal test case that triggers the problem:

evr = """BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Example Corp.//CalDAV Client//EN
BEGIN:VEVENT
UID:19970901T130000Z-123403@example.com
DTSTAMP:19970901T130000Z
DTSTART;VALUE=DATE:19971102
SUMMARY:Our Blissful Anniversary
TRANSP:TRANSPARENT
CLASS:CONFIDENTIAL
CATEGORIES:ANNIVERSARY,PERSONAL,SPECIAL OCCASION
RRULE:FREQ=YEARLY
END:VEVENT
END:VCALENDAR"""

#(...)

    def testXandikosCacheIssue(self):
        c = self._fixCalendar()
        e3=c.save_event(evr)
        some_events = c.search(comp_class=Event, category="PERSONAL")
        some_events = c.search(comp_class=Event, category="PERSONAL")
        some_events = c.search(comp_class=Event, category="PERSONAL")
        some_events = c.search(comp_class=Event, category="PERSONAL")
        some_events = c.search(comp_class=Event, category="PERSONAL")
        some_events = c.search(comp_class=Event, category="PERSONAL")
        some_events = c.search(comp_class=Event, category="PERSONAL")

_fixCalendar does an MKCALENDAR, save_event does a PUT, and search does the REPORT with the XML shown in the top post.

Note that it's only that recurring event that causes problems - the other two events will cause the test code above to pass (ah ... probably because they don't include the category PERSONAL). It also consistently fails at the seventh repetition of the search.

tobixen commented 2 years ago

Hm. This also breaks:

    def testXandikosCacheIssue(self):
        c = self._fixCalendar()
        e3=c.save_event(evr)
        some_events = c.search(comp_class=Event, category="one")
        some_events = c.search(comp_class=Event, category="two")
        some_events = c.search(comp_class=Event, category="three")
        some_events = c.search(comp_class=Event, category="four")
        some_events = c.search(comp_class=Event, category="five")
        some_events = c.search(comp_class=Event, category="fix")
        some_events = c.search(comp_class=Event, category="SEVEN")

while this does not break (obviously because it has no category):

ev1 = """BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Example Corp.//CalDAV Client//EN
BEGIN:VEVENT
UID:20010712T182145Z-123401@example.com
DTSTAMP:20060712T182145Z
DTSTART:20060714T170000Z
DTEND:20060715T040000Z
SUMMARY:Bastille Day Party
END:VEVENT
END:VCALENDAR
"""

    def testXandikosCacheIssue(self):
        c = self._fixCalendar()
        e3=c.save_event(ev1)
        some_events = c.search(comp_class=Event, category="one")
        some_events = c.search(comp_class=Event, category="two")
        some_events = c.search(comp_class=Event, category="three")
        some_events = c.search(comp_class=Event, category="four")
        some_events = c.search(comp_class=Event, category="five")
        some_events = c.search(comp_class=Event, category="fix")
        some_events = c.search(comp_class=Event, category="SEVEN")
        some_events = c.search(comp_class=Event, category="eight")
        some_events = c.search(comp_class=Event, category="nine")
        some_events = c.search(comp_class=Event, category="ten")
        some_events = c.search(comp_class=Event, category="many")
        some_events = c.search(comp_class=Event, category="boring")

Very weird indeed.

(I also tried to search for other things than category, it also made no impact. Apparently it's always the seventh query for category that breaks. In the original case the calendar was even deleted and recreated before things broke).

I'm going afk now.

tobixen commented 2 years ago

The index manager obviously has a counter, so that popular stuff is cached while one-off data isn't.

(Pdb) self.index_manager.find_present_keys(necessary_keys)
(Pdb) self.index_manager.find_present_keys(necessary_keys)
(Pdb) self.index_manager.find_present_keys(necessary_keys)
(Pdb) self.index_manager.find_present_keys(necessary_keys)
(Pdb) self.index_manager.find_present_keys(necessary_keys)
(Pdb) self.index_manager.find_present_keys(necessary_keys)
(Pdb) self.index_manager.find_present_keys(necessary_keys)
['C=VCALENDAR/C=VEVENT/P=CATEGORIES']
(Pdb) self.index_manager.find_present_keys(necessary_keys)
['C=VCALENDAR/C=VEVENT/P=CATEGORIES']

I guess the question is not how this caching system works ... the question is more on the data put into it or taken out from it, it probably needs some kind of conversion somewhere.

tobixen commented 2 years ago

The objects in the cache seems to be ... objects, but the code is doing a vClass.from_ical(foo), which works when foo is a string. For vText this works well, since vText can easily be casted to a string.

jelmer commented 2 years ago

The icalendar interface is a bit odd; it needs to be calling vClass(vClass.from_ical(foo)) to work properly

jelmer commented 2 years ago

Ah, I think I found it. The first time the index is generated, it's use for filtering and not explicitly serialized.