NationalMuseumAustralia / Collection-API

The public web API of the National Museum of Australia

Empty production dates #60

Closed staplegun closed 5 years ago

staplegun commented 6 years ago

646 production dates have a note but no date data. These records fail to load into Solr because date-type fields reject empty values. The conversion to Solr needs to ignore empty date fields.

  <ProductionDates>
   <ProductionDate>
    <ProDateNotes_tab>Not provided</ProDateNotes_tab>
   </ProductionDate>
  </ProductionDates>
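
A minimal sketch of the "ignore empty date fields" idea, in Python rather than whatever the actual conversion pipeline uses. The element names `ProEarliestDate0` and `ProLatestDate0` and the Solr field names are hypothetical placeholders; only `ProDateNotes_tab` appears in this thread. A date field is added to the output document only when the source element exists and has non-blank text.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from source element names to Solr date fields.
# These names are assumptions for illustration, not the real schema.
DATE_FIELDS = {
    "ProEarliestDate0": "earliestDate",
    "ProLatestDate0": "latestDate",
}


def production_date_to_solr(elem):
    """Build a Solr-style dict, leaving out any date field with no data."""
    doc = {}
    for source_name, solr_name in DATE_FIELDS.items():
        # findtext returns None if the element is missing, "" if it is empty.
        text = (elem.findtext(source_name) or "").strip()
        if text:
            doc[solr_name] = text
    notes = (elem.findtext("ProDateNotes_tab") or "").strip()
    if notes:
        doc["dateNotes"] = notes  # hypothetical field name
    return doc


SAMPLE = """<ProductionDates>
 <ProductionDate>
  <ProDateNotes_tab>Not provided</ProDateNotes_tab>
 </ProductionDate>
</ProductionDates>"""

root = ET.fromstring(SAMPLE)
docs = [production_date_to_solr(pd) for pd in root.findall("ProductionDate")]
```

With the sample above, the note is kept and no date field is emitted at all, so Solr never sees an empty date value.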
Conal-Tuohy commented 6 years ago

These data should probably be typed in the SPARQL layer too (and checked at that stage), assuming they are in a standard form rather than free text.

staplegun commented 6 years ago

Found a related use case, where there is only a latest encoded date, but no default AssDate0 (object IRN 116731):

<AssociatedDates>
   <AssociatedDates>
    <AssLatestDate0>1953</AssLatestDate0>
    <AssDateType_tab>Period of use</AssDateType_tab>
   </AssociatedDates>
</AssociatedDates>
staplegun commented 6 years ago

I don't want to use AssDate0, as it often just duplicates the earliest date, which is not an accurate way of describing the whole date for this thing.

date: "11 Jul 2001 - 16 Aug 2017"
earliestDate: "2001-07-11"
latestDate: "2017-08-16"

Ignore AssDate0 and just use earliest/latest. If there is no earliest/latest, copy AssDate0 into them.

If only one of earliest/latest is present, just use that:

date: "- 16 Aug 2017"
latestDate: "2017-08-16"
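
The rule above could be sketched roughly as follows. This is an illustration only: the function name `resolve_dates` is hypothetical, dates are kept as the strings supplied rather than reformatted for display (no "11 Jul 2001" style conversion), and the display forms for an earliest-only date and for identical earliest/latest dates are assumptions that the thread does not spell out.

```python
def resolve_dates(earliest=None, latest=None, default=None):
    """Apply the earliest/latest rule; `default` stands in for AssDate0."""
    # Fall back to the default date only when neither bound is present.
    if not earliest and not latest and default:
        earliest = latest = default

    doc = {}
    if earliest:
        doc["earliestDate"] = earliest
    if latest:
        doc["latestDate"] = latest

    if earliest and latest:
        # Assumption: an exact date (both bounds equal) is shown once.
        doc["date"] = earliest if earliest == latest else f"{earliest} - {latest}"
    elif latest:
        doc["date"] = f"- {latest}"
    elif earliest:
        # Assumption: mirror the latest-only form shown in the thread.
        doc["date"] = f"{earliest} -"
    return doc


latest_only = resolve_dates(latest="1953")
date_range = resolve_dates(earliest="2001-07-11", latest="2017-08-16")
default_only = resolve_dates(default="1953")
```

When no date data is present at all, the function returns an empty dict, so no date fields would be sent to Solr.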
nmamanager commented 6 years ago

This approach is confirmed.

With one question: if the XML includes the date type, it would be good to include this as well, since it explains the date. For example: Period of use.

nmamanager commented 6 years ago

And some sample records for when it comes to testing:

Date range example: 227892
Latest date only: 116731
Earliest date only: 70136
Exact dates (i.e. no date range): 64620