ThreeSixtyGiving / standard

The 360Giving data standard for UK philanthropic giving
http://www.threesixtygiving.org

Proposal: explicit date or date-time formats for uncertain dates #220

Closed BobHarper1 closed 6 years ago

BobHarper1 commented 6 years ago

Proposal

An update to the schema to be explicit that dates in this group can be formatted as one of:

Full year-month-day 'the letter T' hour:minute:second Timezone

yyyy-mm-ddThh:mm:ss<timezone>

<timezone> should be replaced with: 'Z' for UTC, or ±hh:mm for timezones other than UTC

e.g. 2017-10-18T15:01:23Z

OR:

Full year-month-day

yyyy-mm-dd

e.g. 2017-10-18

OR:

Full year-month

yyyy-mm

e.g. 2017-10

OR:

Full year

yyyy

e.g. 2017
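As a rough illustration of the four shapes listed above, here is a minimal Python sketch that classifies a string by format. The name classify_date and the format labels are hypothetical and not part of the 360Giving schema or tooling. The check only looks at the shape of the string, so it would accept an impossible value such as month 13.

```python
import re

# Hypothetical helper for illustration only; not part of the 360Giving schema.
# Each pattern checks the *shape* of the string, not whether the date exists.
_PATTERNS = {
    "date-time": re.compile(
        r"[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}"
        r"(Z|[+-][0-9]{2}:[0-9]{2})"
    ),
    "date": re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}"),
    "year-month": re.compile(r"[0-9]{4}-[0-9]{2}"),
    "year": re.compile(r"[0-9]{4}"),
}


def classify_date(value):
    """Return the label of the proposed format that value matches, or None."""
    for label, pattern in _PATTERNS.items():
        if pattern.fullmatch(value):
            return label
    return None


for example in ("2017-10-18T15:01:23Z", "2017-10-18", "2017-10", "2017", "18/10/2017"):
    print(example, "->", classify_date(example))
```

A stricter check could additionally try to parse the matched value as a calendar date.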

Why make this change?

This group of dates gives publishers a way to provide dates to users when they hold anything from partial dates to full dates and times. A good example is a 'planned date', where someone knows the month in which something is likely to start or end but doesn't yet know the exact day.

In the schema, such dates are handled through the Event object. In effect this covers the startDate and endDate fields available via plannedDates and actualDates.

Currently, the schema expects a date-time but documentation states:

“Dates should be in YYYY-MM-DD format. If the month or day are not available, these may be omitted.”

This causes confusion for publishers when they publish their grants information based on the schema.

Specifying that uncertain dates can be used, and how they can be used, helps data publication.

Impact

This will require a change to the JSON schema. Anyone relying on the current version of the schema will be affected. However, any effects are likely to be small, as the change is only about how dates may be formatted. Anyone following the schema will already have allowed date-times. These will still be allowed, and the shortened date formats will now be allowed as well.

This will also require a change to the documentation both within the schema and elsewhere to communicate the options and the rationale behind them. Publishers should check that they are complying with this requirement of the standard and seek help if they need it.

There is a potential to confuse data users with mixed formats in the fields, though this is balanced against accuracy of information from publishers. For this, we propose consultations with analysts and developers using 360Giving data to understand how they use this information and the impact the change will have. We propose the changes above are progressed and feedback from the consultation is incorporated into the next review round for the Standard.
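To give a sense of what mixed formats could mean for a data user, here is a minimal Python sketch that reduces any of the four proposed formats to the earliest calendar date it could refer to, plus a note of how precise the original value was. The function name earliest_date and the precision labels are hypothetical. The sketch assumes the value already matches one of the proposed formats, and it takes the date part of a date-time as written, ignoring the time and timezone.

```python
from datetime import date


def earliest_date(value):
    """Return (date, precision) for a yyyy, yyyy-mm, yyyy-mm-dd or date-time string.

    Hypothetical helper for illustration only. Missing month or day parts
    default to 1, so "2017-10" becomes 1 October 2017 with precision "month".
    """
    day_part = value.split("T", 1)[0]           # drop any time and timezone
    parts = [int(p) for p in day_part.split("-")]
    precision = {1: "year", 2: "month", 3: "day"}[len(parts)]
    if "T" in value:
        precision = "date-time"
    year, month, day = (parts + [1, 1])[:3]     # pad missing month/day with 1
    return date(year, month, day), precision


for example in ("2017", "2017-10", "2017-10-18", "2017-10-18T15:01:23Z"):
    print(example, "->", earliest_date(example))
```

Keeping the precision alongside the date lets an analyst sort or filter on the date while still knowing which values were only approximate.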

There is potential for confusion about actualDates. Actual dates should really be known dates. We can see cases where, for example, historical records only record a year and month, but actual dates are really intended to give the exact date.

The schema change will make it easy for validators to check that a date is in one of the acceptable formats, but additional steps would be needed to check whether actualDates are full dates (YYYY-MM-DD).
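One possible form of that additional step is sketched below in Python. It assumes a grant is a dict whose actualDates value is a list of event objects with optional startDate and endDate strings. That structure, the function name partial_actual_dates, and the choice to accept a date-time as a full date are all assumptions made for illustration.

```python
import re

# A full date, optionally followed by a time part (assumed acceptable here).
FULL_DATE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}(T.*)?")


def partial_actual_dates(grant):
    """List (index, field, value) for actualDates values that are not full dates.

    Hypothetical helper: assumes grant["actualDates"] is a list of dicts
    with optional "startDate" and "endDate" strings.
    """
    problems = []
    for index, event in enumerate(grant.get("actualDates", [])):
        for field in ("startDate", "endDate"):
            value = event.get(field)
            if value is not None and not FULL_DATE.fullmatch(value):
                problems.append((index, field, value))
    return problems


grant = {"actualDates": [{"startDate": "2017-10", "endDate": "2017-10-18"}]}
print(partial_actual_dates(grant))
```

A validator could report these as warnings rather than errors, since the proposed schema would still accept the shorter formats.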

You can give feedback on this proposal in the topic on the 360Giving Community Forum.

Racheler commented 6 years ago

This is a sensible approach for publishers that don't collect more detailed date-time data. We'll need to look at how best to explain this in the guidance so we make clear that it's only for organisations that don't have more detail, i.e. so we don't reduce the ambition of releasing detailed date-time data.

morchickit commented 6 years ago

So it only covers plannedDates and actualDates? How many people use them now? I believe not many at all to begin with...

BobHarper1 commented 6 years ago

More than I'd presumed are using plannedDates: around 38 datasets have them (there might be slightly fewer unique publishers in that). Only one is using actualDates.

As we're giving a wide range of valid formats, I think that most if not nearly all will already be using one of them.

stevieflow commented 6 years ago

Checking what the current situation is on this, we have the following descriptions for startDate and endDate (both of which are available via plannedDates and actualDates):

startDate All events should have a start date. Dates should be in YYYY-MM-DD or date-time format. If the month or day are not available, these may be omitted.

endDate Events or activities lasting more than one day should have either a duration (in months) or an end date. Dates should be in YYYY-MM-DD or date-time format. If the month or day are not available, these may be omitted.

This description now seems acceptable and in line with this proposal. There is no exact specification of the date-time format, but date-time is included as an option.

--> @Bjwebb @morchickit @KDuerden

Bjwebb commented 6 years ago

Yes, I think this issue is now done.