bitfireAT / ical4android

Allows usage of iCalendar files with the Android calendar provider
GNU General Public License v3.0
19 stars 10 forks source link

Handling period durations of days with `T` prefix #74

Closed ArnyminerZ closed 1 year ago

ArnyminerZ commented 1 year ago

Context is in icsx5#100.

Changes

fixInvalidDayOffset

The logic for doing the replacements is:

  1. Searches for all the matches in the string of the Regexp -?PT-?\d+D, this is all the matches that start optionally with a -, followed by PT, then optionally a - (this is because the negative indicator can be before PT or in the number properly), followed by a number, and the letter D.
  2. Iterates for each match.
    1. Get the starting and end positions of the match.
    2. Search the position of the offset number in the string.
    3. Convert that number to a long.
    4. Multiply it by 24.
    5. Replace the found match by the prefix given (can be PT or -PT) followed by the amount of hours found, and then H.

Extra considerations

Right now all matches of the "PT pattern" are replaced. This might cause issues with descriptions or other fields that for any reason have that format. Maybe we should list all the properties that have the Duration type, which as far as I'm concerned are:

rfc2822 commented 1 year ago

Thanks! I have added some preprocessor generalization.

Could you please change the FixInvalidDayOffsetPreprocessor accordingly (see FIXME) and add tests similar to FixInvalidUtcOffsetPreprocessorTest?

ArnyminerZ commented 1 year ago

I really like the generalization. Take a look at the tests just in case I've missed something, or tell me if you prefer to do them in another way 😉

ArnyminerZ commented 1 year ago

Mmmh... I'm not sure that the implementation really works out. It does for individual strings, but since we are processing the whole string, something like this:

Only the wrong data should be updated DURATION:-PT1D sure

Would not be fixed.

Removing ^ and $ should do the trick, I'll check now

Edit: is does seem to work however with ICalPreprocessorTest.testFixInvalidDurationTPrefixOffset so I guess Regex.findAll does indeed find this because the string is passed in different lines, and regex inspects them line by line.

rfc2822 commented 1 year ago

Mmmh... I'm not sure that the implementation really works out. It does for individual strings, but since we are processing the whole string, something like this:

Only the wrong data should be updated DURATION:-PT1D sure

Would not be fixed.

Removing ^ and $ should do the trick, I'll check now

Which is what we want. When someone writes the above text as a SUMMARY (for instance when this issue would be converted to an iCalendar 😄), the text shouldn't change. It should only change for DURATION and TRIGGER properties. With ^(DURATION|TRIGGER):…$ we can make sure that only lines beginning with DURATION or TRIGGER will be modified.

Edit: is does seem to work however with ICalPreprocessorTest.testFixInvalidDurationTPrefixOffset so I guess Regex.findAll does indeed find this because the string is passed in different lines, and regex inspects them line by line.

I think this is what RegexOption.MULTILINE does.

Two other things:

  1. What about getting the time string using parentheses and Regex groupValues? Then we wouldn't have to juggle with substrings and String indexes.
  2. I have mentioned it in a previous message that I have already deleted, because I was not sure after writing the message. But: the duration of a day is not by definition 24 hours. See https://www.rfc-editor.org/rfc/rfc5545#section-3.3.6:

The duration of a week or a day depends on its position in the calendar. In the case of discontinuities in the time scale, such as the change from standard time to daylight time and back, the computation of the exact duration requires the subtraction or addition of the change of duration of the discontinuity.

With DST, one day in the year has 23 hours, and another day has 25 hours. This is relevant for calculating events, and I think it's the reason why the "T" is not used for days and weeks in the iCalendar duration syntax.

So I'd suggest to rewrite PTxD to PxD instead of PT<x*24>H.

ArnyminerZ commented 1 year ago

So I'd suggest to rewrite PTxD to PxD instead of PT<x*24>H.

Yup, I think this would be the easiest approach. Also simplifies parsing.