novoid / Memacs

What did I do on February 14th 2007? Visualize your (digital) life in Org-mode
GNU General Public License v3.0
1.03k stars 66 forks source link

Correct timezone handling for ical #108

Closed camdez closed 3 years ago

camdez commented 3 years ago

Properly handles DATEs and DATE-TIMEs in all formats supported by the iCalendar specification and the X-WR-TIMEZONE convention commonly used by vendors like Google.

DATE-TIMEs are converted to the local timezone when when Memacs runs since org-mode doesn't support timezones.

Additionally:

camdez commented 3 years ago

I thought this was ready to go but it strikes me that hashes probably shouldn't drift based on local time changes so that a calendar can be regenerated in a different local time zone and entries can match up.

Also this makes me realize that I don't know much about what these hashes are actually used for and I've just been assuming, so I'll dig in more and circle back shortly.

If they're used to match up headlines across changes, it seems like we'd ideally just use the underlying event IDs from the iCalendar file.

camdez commented 3 years ago

Ok, having taken a beat here to review the docs for how the hashes work, I see that my approach was misguided. I think we should just use the VEVENTS's UID property:

Description: The "UID" itself MUST be a globally unique identifier. The generator of the identifier MUST guarantee that the identifier is unique.

Backwards compatibility will suffer, but things will get MUCH better going forward. I'm open to input on that, but the old IDs would change if the event summary changed AT ALL, which is hardly a reliable method.

novoid commented 3 years ago

Ok, having taken a beat here to review the docs for how the hashes work, I see that my approach was misguided. I think we should just use the VEVENTS's UID property:

Description: The "UID" itself MUST be a globally unique identifier. The generator of the identifier MUST guarantee that the identifier is unique.

Backwards compatibility will suffer, but things will get MUCH better going forward. I'm open to input on that, but the old IDs would change if the event summary changed AT ALL, which is hardly a reliable method.

AFAIR we introduced the hashes mainly for providing the ability to append only new data from the input and ignore data that correspondents to entries whose hash value is already present.

While this makes perfect sense for other modules such as banking statements, it is not easy to define the data that makes a calendar entry "unique" since the description may be changed without changing the timeslot and vice versa. Therefore, you can decide how to handle this situation. I may be also fine with omitting UUIDs because I don't see much value in them when I think of it. Exception: the UUIDs are always the same for each appointment (I haven't read the details of its origin).

What do you suggest?

camdez commented 3 years ago

@novoid Ooooh. Do I understand correctly that if existing data shares the same hash then that new data will be skipped entirely (not updated with the new data)?

novoid commented 3 years ago

@novoid Ooooh. Do I understand correctly that if existing data shares the same hash then that new data will be skipped entirely (not updated with the new data)?

Imagine one large banking statement CSV file. Your bank only allows you to download your entire history at once, no detla files, queries for time-frames or monthly files.

Now you've got the problem of converting this data to Org files. Either you throw away everything with every Memacs invocation, building everything from scratch or you do find a method to determine "what is already known?" so that you can ignore those entries, just appending previously unknown entries.

Therefore, Memacs framework offers functionality to define parts of your data such as CSV columns date + description to be part of a hash. Assuming that date + description do not change in-between CSV downloads for old data, you get a unique identifier for existing data you can ignore when appending to your existing Org file.

My point is that it is hard to define such data points for appointments since changing the subject line seems to be a common operation for me as well as moving an appointment to a different time slot. As long as the original data source does not provide something unique to a single entry, this idea of having hashed/UUIDs/... for recognizing already known entries when being in append-mode is dead to my opinion.

If this holds true, append mode does not make sense for ical data. But maybe you've got a good idea how to tackle this. To me personally, I don't care since I do not use append more for ical.

camdez commented 3 years ago

@novoid That makes perfect sense. Thank you for the explanation, and apologies that I haven't had much time to work on this lately.

In cases like this, given that events have identities (and are mutable), an update mode probably makes sense. Agree that append mode doesn't make that much sense for this case since events could be updated on the calendar but those updates would be skipped in the append.

I think the way I have this implemented now (where the hashes are based on the event IDs) probably is the best one at the moment, but it's arguable and probably fairly negligible since not using append mode feels like the right approach.

From my perspective, this PR is good to go. Let me know if I can help with anything, be that changes to match project conventions, explanations, tests, or documentation. The status quo (prior to this PR) is just completely unworkable for me because basically every event from every iCal feed I care about shows up at the wrong time; I took great pains to figure out how the various possible cases work and cover them all, so I'm feeling good about the quality of the result.

I know it's a fairly large amount of code changed, but you should find that the tests are pretty comprehensive.

novoid commented 3 years ago

Thanks @camdez for you effort! I wasn't affected by this but but your fix adds substantial value to the iCal module for non-trivial use. :+1:

novoid commented 3 years ago

I added a note to the documentation of the module to make it clear to the users of this module that append mode doesn't make much sense here at the moment.

camdez commented 3 years ago

Awesome, thanks @novoid!