novoid / Memacs

What did I do on February 14th 2007? Visualize your (digital) life in Org-mode
GNU General Public License v3.0

memacs/ical fails to parse valid iCalendar files & generally mishandles timezones #107

Closed by camdez 3 years ago

camdez commented 3 years ago

I spent quite a while in the timezone trenches sorting out why Memacs didn't appear to generate correct timestamps when I imported iCalendar files.

Per the iCalendar specification (RFC 5545), DTSTART and DTEND are the primary fields used to specify when events begin and end, and for them...

The [TZID] parameter MUST be specified on properties with a DATE-TIME
value if the DATE-TIME is not either a UTC or a "floating" time.
Failure to include and follow VTIMEZONE definitions in iCalendar
objects may lead to inconsistent understanding of the local time at
any given location.

TZID (when present) contains a reference to an entry in the VTIMEZONE component of the same file, which defines the details of the time zone.

That said, with the introduction of CalDAV and RFC 7809, many CalDAV servers ceased providing timezone data (viz. the VTIMEZONE component) because...

In many cases, these "VTIMEZONE" components can be larger, octet-wise,
than the events or tasks that make use of them.  However, iCalendar
currently requires all iCalendar objects ("VCALENDAR" components) that
refer to a time zone via its identifier to also include the
corresponding "VTIMEZONE" component.  This leads to inefficiencies in
the CalDAV protocol because large amounts of "VTIMEZONE" data are
continuously being exchanged, and for the most part these time zone
definitions are unchanging.  This is particularly problematic for
mobile or limited devices, with limited network bandwidth, CPU, and
energy resources.

And...

Observation and experiments have shown that, in the vast majority of
cases, CalDAV clients have typically ignored time zone definitions in
data received from servers, and instead make use of their own "built-
in" definitions for the corresponding time zone identifier.  This
means that it is reasonable for CalDAV servers to unilaterally decide
not to send "VTIMEZONE" components for standard time zones that
clients are expected to have "built-in" (i.e., IANA time zones).
Thus, in the absence of a "CalDAV-Timezones" request header field,
servers advertising the "calendar-no-timezone" capability MAY opt to
not send standard "VTIMEZONE" components.  Servers that do that will
need to provide an administrator configuration setting to override the
new default behavior based on client "User-Agent" request header field
values, or other suitable means of identifying the client software in
use.

As a result of these changes, many of the iCalendar files floating around in the real world, which users may wish to import via Memacs, don't conform to the original iCalendar specification, since their timezone information was communicated out of band.

One commonly used (e.g. by Google Calendar), lightweight workaround for the discarded timezone information is the X-WR-TIMEZONE extension. It simply contains a single (presumably) IANA Time Zone Database identifier which (again, presumably) applies to all date-times that don't otherwise specify a timezone. Without it, those date-times would have been interpreted as "floating times":

They are used to represent the same hour, minute, and second value
regardless of which time zone is currently being observed.  For
example, an event can be defined that indicates that an individual
will be busy from 11:00 AM to 1:00 PM every day, no matter which time
zone the person is in.
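To make the distinction concrete, here is a minimal sketch that tells the DATE-TIME forms apart from a single DTSTART/DTEND content line. The helper name `classify_date_time` is hypothetical (it is not part of Memacs or icalendar), and the parsing is deliberately simplified: it assumes an unfolded line and no quoted parameter values containing a colon.

```python
def classify_date_time(content_line: str) -> str:
    """Hypothetical helper: classify a DTSTART/DTEND content line.

    Returns "date" (all-day), "utc", "tzid", or "floating".
    Simplification: splits at the first colon, so quoted parameter
    values that contain a colon are not handled.
    """
    name_and_params, _, value = content_line.strip().partition(":")
    params = [p.upper() for p in name_and_params.split(";")[1:]]

    if "VALUE=DATE" in params:
        return "date"       # all-day event, no time of day at all
    if value.upper().endswith("Z"):
        return "utc"        # "Z" suffix marks a UTC date-time
    if any(p.startswith("TZID=") for p in params):
        return "tzid"       # local time with a time zone reference
    return "floating"       # no "Z", no TZID: same wall-clock time everywhere


print(classify_date_time("DTSTART:20181103T201500"))                    # floating
print(classify_date_time("DTSTART:20181103T191500Z"))                   # utc
print(classify_date_time("DTSTART;TZID=Europe/Berlin:20181103T201500")) # tzid
print(classify_date_time("DTSTART;VALUE=DATE:20181103"))                # date
```

Only the "floating" case is the one X-WR-TIMEZONE is presumed to affect.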

The original authors of ical.py must have been working exclusively with such files because the parser requires the presence of the (non-standard) X-WR-TIMEZONE line. Considering the nature of this project, I'm inclined to be pragmatic rather than a stickler about the original specification (additional discussion here), but I definitely do think that compliant iCal files should be parseable!

What's more, the timezone conversion via X-WR-TIMEZONE just doesn't seem to work. Currently I'm in Pacific time (aka America/Los_Angeles), and importing the following simple calendar file...

BEGIN:VCALENDAR
VERSION:2.0
PRODID:manual
X-WR-TIMEZONE:Europe/Berlin
BEGIN:VEVENT
CLASS:PUBLIC
DTSTART:20181103T201500
DTEND:20181103T211500
UID:test.ics
DTSTAMP:20190127T140400
DESCRIPTION:date 2018-11-03, time 20:15 UTC+1, in-calendar VTIMEZONE as ret
 urned by http://tzurl.org/zoneinfo-outlook/Europe/Berlin, "Outlook" style
SUMMARY:date 2018-11-03, time 20:15 UTC+1, in-calendar VTIMEZONE as returne
 d by http://tzurl.org/zoneinfo-outlook/Europe/Berlin, "Outlook" style
END:VEVENT
END:VCALENDAR

Yields this:

## -*- coding: utf-8 mode: org -*-
## This file was generated by bin/memacs_ical.py. Any modification will be overwritten upon next invocation.
## To add this file to your list of org-agenda files, open the stub file (file.org) not this file (file.org_archive) within emacs and do following: M-x org-agenda-file-to-front
* Memacs for ical Calendars          :Memacs:calendar:
** <2018-11-03 Sat 20:15>--<2018-11-03 Sat 21:15> date 2018-11-03, time 20:15 UTC+1, in-calendar VTIMEZONE as returned by http://tzurl.org/zoneinfo-outlook/Europe/Berlin, "Outlook" style
   :PROPERTIES:
   :DESCRIPTION: date 2018-11-03, time 20:15 UTC+1, in-calendar VTIMEZONE as returned by http://tzurl.org/zoneinfo-outlook/Europe/Berlin, "Outlook" style
   :ID:          4340f5faaaca4ed68142dd51ffb066d235625acb
   :END:

* successfully parsed 1 entries by bin/memacs_ical.py at [2021-09-08 Wed 02:39] in ~0.002410s.

It looks as if the date is simply being interpreted as UTC.
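For comparison, here is a minimal sketch of the conversion one might expect for this event. It assumes that X-WR-TIMEZONE applies to the floating DTSTART and that the output should be in the importing user's zone (America/Los_Angeles here). It uses only the standard library `zoneinfo` module (Python 3.9+) and is not what ical.py does.

```python
from datetime import datetime
from zoneinfo import ZoneInfo  # standard library since Python 3.9

# DTSTART:20181103T201500 has no "Z" suffix and no TZID: a floating time.
floating_start = datetime.strptime("20181103T201500", "%Y%m%dT%H%M%S")

# Assumption: X-WR-TIMEZONE:Europe/Berlin applies to floating times.
berlin_start = floating_start.replace(tzinfo=ZoneInfo("Europe/Berlin"))

# Convert to the importing user's zone.
local_start = berlin_start.astimezone(ZoneInfo("America/Los_Angeles"))

print(local_start.strftime("<%Y-%m-%d %a %H:%M>"))  # <2018-11-03 Sat 12:15>
```

Under those assumptions the event would land at 12:15 Pacific time, not the 20:15 shown in the generated Org file.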

From the comments in the source file, it looks like the authors may have mistakenly presumed that all timestamps would either be in GMT or belong to all-day events.

FWIW, the good news is that icalendar already handles all of the difficult work of working with VTIMEZONEs, but it does not handle X-WR-TIMEZONE, so a hybrid approach is likely necessary.

My recommendation would be to attempt to lean on icalendar's timezone handling, and only fall back to applying X-WR-TIMEZONE for floating events if X-WR-TIMEZONE exists. That approach should handle all (normal) standards-compliant files and all X-WR-TIMEZONE-containing files.
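A minimal sketch of that fallback logic follows. It is not the code from the eventual fix, and the helper name `resolve_timestamp` is hypothetical. It assumes the parser hands back timezone-aware `datetime` objects for UTC/TZID values and naive ones for floating values, and that output should be in the user's local zone. All-day (`date`) values are out of scope here.

```python
from datetime import datetime, timezone
from typing import Optional
from zoneinfo import ZoneInfo  # standard library since Python 3.9


def resolve_timestamp(dt: datetime,
                      x_wr_timezone: Optional[str],
                      local_tz: ZoneInfo) -> datetime:
    """Hypothetical helper: return dt as an aware datetime in local_tz."""
    if dt.tzinfo is not None:
        # UTC or TZID-qualified: the parser already resolved the zone.
        return dt.astimezone(local_tz)
    if x_wr_timezone:
        # Floating, but the file names a default zone: apply it, then convert.
        return dt.replace(tzinfo=ZoneInfo(x_wr_timezone)).astimezone(local_tz)
    # Truly floating: keep the wall-clock time as-is in the local zone.
    return dt.replace(tzinfo=local_tz)


local = ZoneInfo("America/Los_Angeles")
floating = datetime(2018, 11, 3, 20, 15)
utc_time = datetime(2018, 11, 3, 19, 15, tzinfo=timezone.utc)

print(resolve_timestamp(floating, "Europe/Berlin", local))  # 2018-11-03 12:15:00-07:00
print(resolve_timestamp(utc_time, "Europe/Berlin", local))  # 2018-11-03 12:15:00-07:00
print(resolve_timestamp(floating, None, local))             # 2018-11-03 20:15:00-07:00
```

Checking `tzinfo` first means X-WR-TIMEZONE never overrides a zone the file states explicitly, so standards-compliant files behave the same whether or not the extension is present.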

PR incoming shortly.

camdez commented 3 years ago

Resolved via 5429c96e25885547f24c430b710cbb59f4ea6c7a.