Open nickevansuk opened 3 years ago
Task list:
[x] The validator should include a rule that validates that a Schedule contains a valid iCal recurrence rule.
[x] Using the rrule, validate that all exceptDate values are actually part of the recurrence defined by the rrule, and produce a WARNING if any are outside of the rrule.
[x] validate that if a PartialSchedule includes startTime or endTime, it must also include scheduleTimezone, producing a FAILURE otherwise
[x] validate that idTemplate and urlTemplate include a {startDate} placeholder, and are valid .
[x] A separate generic rule (#377) that checks the values are valid Uri Templates
[x] A specific rule that targets idTemplate and urlTemplate and checks that they both include {startDate}, producing a FAILURE otherwise
[ ] validate that scheduledEventType is a valid Event subclass.
[x] validate that repeatCount is a positive integer greater than zero
[x] validate that scheduleTimezone matches a value from the IANA Time Zone Database.
@thill-odi @nickevansuk
I've been looking at the RRule
validation today and wanted to ask a couple questions. There is very little information required to generate a valid RRule
. I think fundamentally it is just a frequency that is required (iCal spec). The requirement for a Schedule from schema.org.
Do we want to raise any warnings or failures in addition to the these requirements, for fields such as startDate
/startTIme
, or duration? If my understand is correct these fields could surface in other locations in the data? (e.g. courseInstance example
@Lathrisk so we need enough data to generate at least one occurrence when ignoring exceptDate
- my understanding is that many properties are only valid in certain value combinations - so perhaps one could try to simply generate the first occurance in the schedule and see if it's possible? So assume in this case yes it will need startDate
and startTime
If the schedule does not generate any occurrences it should be a FAILURE
All data used to generate a Schedule
must come from within the Schedule
only; it is entirely self-contained.
Also, to help with:
Using the rrule, validate that all exceptDate values are actually part of the recurrence defined by the rrule, and produce a WARNING if any are outside of the rrule.
Potential logic as follows for the exceptDate
check:
exceptDate
Here's the schedule generation code that imin uses, complete with a working timezone implementation:
https://gist.github.com/lukehesluke/b00a20caf22abf26b38c0c5e3238e150
Note specifically that process.env.TZ
must be set when the validator starts in order for RRule to behave as expected in all environments
You won't need all of the code above, but it should give you a good head start here :)
Thanks, @nickevansuk. I had started using rrule with luxon so will take a look at this as a more predictable alternative.
@nickevansuk (cc @thill-odi) I've been working through the timezone code, but I have been trying to understand the reasoning for implementing this as part of this rule. Surely the primary requirement is ensuring that the time, date, and timezone are provided, rather than the specific implementation of the RRule generation that we apply here? The generated recurrence rule is a byproduct of the recommendations in the developer docs, rather than concrete values in the OpenActive data that we need to validate?
Is there something important that I am missing?
Not sure I'm following @Lathrisk - the only reason for the timezone's need to be accurate is for the exceptDate
validation (as per https://github.com/openactive/data-model-validator/issues/365#issuecomment-806237716).
So perhaps I've misunderstood? Also should have mentioned previously that if using luxon is something that gets around needing to set process.env.TZ
that sounds great (as setting that globally isn't ideal for a library) (unless this was something @lukehesluke already tried when working on the Gist above?)
Thanks @nickevansuk, that does make sense wrt to the exceptDate
's. I think what hasn't been clear to me is the format of the exceptDate
field. Or I had made an assumption about it'd format.
Is this always in UTC? I notice that the examples provide the closing Z
but I couldn't find that in the documentation. I had previously assumed that would be listed in the same timezone as the scheduledTimezone
field.
If I understand correctly then we are generating the RRule with the 'correct' UTC dates so that we can compare this to the exceptDates which are in UTC?
unless this was something @lukehesluke already tried when working on the Gist above?
I attempted to use the tzid/Luxon solution specified in their docs here 👉 https://github.com/jakubroztocil/rrule#timezone-support but it's not machine-independent. In their example, they show that the tzid
is offset against the system timezone (which causes issues when running across machines which do not all share the same system timezone). It additionally has the same issue of not producing valid dates when considering the timezone designator. They are only valid if you ignore the timezone designator (which will be UTC if following rrule
s guidelines) and assume that the system's timezone is being used instead.
For example, it failed the "British Summer Time test", presumably because my local machine was already set to Europe/London
> var myRule = new RRule({
freq: RRule.WEEKLY,
interval: 1,
byweekday: [RRule.MO, RRule.FR],
dtstart: new Date(Date.UTC(2021, 2, 18, 7, 30)),
until: new Date(Date.UTC(2021, 3, 6)),
tzid: 'Europe/London'
});
> myRule.all();
[ 2021-03-19T07:30:00.000Z,
2021-03-22T07:30:00.000Z,
2021-03-26T07:30:00.000Z,
// this should be 2021-03-29T06:30:00.000Z (i.e. the UTC version of 2021-03-29T07:30:00.000+01:00)
// as BST occurs on the 28th March. Same for the subsequent ones:
2021-03-29T07:30:00.000Z,
2021-04-02T07:30:00.000Z,
2021-04-05T07:30:00.000Z ]
as setting that globally isn't ideal for a library
Agreed! You might be able to use the tzid/luxon solution without requiring that process.env.TZ
is a special value by getting the resulting dates and then transforming them using whatever the current value for process.env.TZ
is.
e.g. something like (NOTE: I haven't actually tested this):
return datesWithWeirdTimezones.map(date =>
DateTime.fromJSDate(date)
.toUTC()
.setZone(process.env.TZ, { keepLocalTime: true }));
This could work if TZ
is guarantueed to always have a value
EDIT: I can confirm that process.env.TZ
is not guarantueed to always have a value - maybe there's another way of getting the system IANA timezone?
This issue actually likely contains a number of rules, and the various requirements below could be split into different rules in a number of ways. Generally it's better to keep rule files as focussed as possible, with a good separation of concerns between the validation being conducted by each file.
Validate recurrence rule from
Schedule
The validator should include a rule that validates that a
Schedule
contains a valid iCal recurrence rule.There's already schedule generation logic in the conformance services, so it should be easy to convert this into some validation logic for the same fields, creating an rrule.
Note the logic above does not include
scheduleTimezone
(which it definitely should), so this will need to be added.If there are not enough properties to create an rrule (e.g. the rrule library might return an error saying this?), display #292
If the properties do not create a valid rrule, return an error from the rrule library that explains what’s wrong, as a FAILURE
Using the rrule, validate that all
exceptDate
values are actually part of the recurrence defined by the rrule, and produce a WARNING if any are outside of the rrule.Also note this other rrule library that has better support for timezones, if the library used above is problematic https://www.npmjs.com/package/@rschedule/core - however it has much lower usage so the rrule library is preferred if possible.
The Bookwhen (http://data.bookwhen.com/) and Open Sessions (https://opensessions.io/openactive) feeds should contain good test data for this.
Validate specific properties within
PartialSchedule
(target onlyPartialSchedule
)PartialSchedule
includesstartTime
orendTime
, it must also includescheduleTimezone
, producing a FAILURE otherwiseValidate specific properties within
Schedule
(target onlySchedule
)validate that
idTemplate
andurlTemplate
include a{startDate}
placeholder, and are valid .idTemplate
andurlTemplate
and checks that they both include{startDate}
, producing a FAILURE otherwisevalidate that
scheduledEventType
is a validEvent
subclass.Validate specific properties within
Schedule
andPartialSchedule
(target both)validate that repeatCount is a positive integer greater than zero
Schedule
andPartialSchedule
via the data-modelsvalidate that
scheduleTimezone
matches a value from the IANA Time Zone Database.scheduleTimezone
and uses e.g.moment.tz.names()
(which requires adding moment-timezone) to validate the entries, producing a FAILURE if the value does not match. Note this is a separate rule from the generic rrule validation as it also applies toPartialSchedule
.Also consider whether extending the below could be an approach to implementing any of the above:
See further documentation: