mapping-manuscript-migrations / mmm-fuseki


Problems with date values? #2

Closed gklyne closed 3 years ago

gklyne commented 4 years ago

(I'm probably posting this in the wrong place, but I hope you'll be able to relocate the issue to the right place.)

Background: I'm attempting to load the MMM data into a Blazegraph store using the Blazegraph equivalent of the TDB loader, following this Dockerfile recipe as a guide.

I'm getting most of the data to load, but the Blazegraph loader appears to be complaining about dates with single-digit day values; e.g.

Will load from: /mmm_data/mmm_bodley.ttl
Journal file: ./runtime-data/blazegraph.jnl
ERROR: LexiconConfiguration.java:707: -0029-01-0: value=-0029-01-0
ERROR: LexiconConfiguration.java:707: -0029-12-3: value=-0029-12-3
 :
(etc.)

The messages here aren't helpful, but as the values being loaded are datatyped as xsd:date, as in:

<http://ldf.fi/mmm/time/bodley_person_100172872_birth_timespan>
    ecrm:P82a_begin_of_the_begin "-0029-01-0"^^xsd:date ;
    ecrm:P82b_end_of_the_end "-0029-12-3"^^xsd:date ;
    mmms:decade -30 ;
    dct:source mmms:Bodley ;
    a ecrm:E52_Time-Span ;
    skos:prefLabel "-002 - -002" .

then the messages appear to be correctly identifying errors in the data.

Cf. the XML Schema description of the date lexical form: "dd is a two-digit numeral that represents the day"

(I'm seeing similar errors in some of the other source files.)
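
For anyone wanting to check the other source files before loading, here is a minimal sketch of a scan for malformed xsd:date literals. It assumes the literals are written inline as "..."^^xsd:date with the xsd: prefix, as in the excerpt above. The regular expression follows the lexical pattern for xsd:date (optional minus sign, at least four year digits, two-digit month and day, optional timezone); it does not check that the day exists in the given month. The names is_xsd_date and find_bad_dates are illustrative only and are not part of any MMM or Blazegraph tooling.

```python
import re

# Lexical pattern for xsd:date: optional '-', 4+ digit year,
# two-digit month, two-digit day, optional timezone.
XSD_DATE = re.compile(
    r"-?(?:[1-9][0-9]{3,}|0[0-9]{3})"
    r"-(?:0[1-9]|1[0-2])"
    r"-(?:0[1-9]|[12][0-9]|3[01])"
    r"(?:Z|[+-](?:(?:0[0-9]|1[0-3]):[0-5][0-9]|14:00))?"
)

# Assumption: literals appear inline as "value"^^xsd:date (prefixed form).
TYPED_DATE = re.compile(r'"([^"]*)"\^\^xsd:date')


def is_xsd_date(value):
    """Return True if value matches the xsd:date lexical pattern."""
    return XSD_DATE.fullmatch(value) is not None


def find_bad_dates(turtle_text):
    """Return (line_number, value) pairs for malformed xsd:date literals."""
    bad = []
    for lineno, line in enumerate(turtle_text.splitlines(), start=1):
        for value in TYPED_DATE.findall(line):
            if not is_xsd_date(value):
                bad.append((lineno, value))
    return bad


sample = '''<http://ldf.fi/mmm/time/bodley_person_100172872_birth_timespan>
    ecrm:P82a_begin_of_the_begin "-0029-01-0"^^xsd:date ;
    ecrm:P82b_end_of_the_end "-0029-12-3"^^xsd:date .
'''

for lineno, value in find_bad_dates(sample):
    print(f"line {lineno}: bad xsd:date {value!r}")
```

Running find_bad_dates over the full text of each .ttl file would list every literal that a strict loader is likely to reject.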

DILewis commented 4 years ago

This looks like a correctly formatted xsd:date that has been truncated to fit a 10-character text field, whereas a BC date, with its leading minus sign, is 11 characters long.
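
A small sketch of that hypothesis: cutting an 11-character BC date to 10 characters reproduces the values in the error log exactly. The full dates used here (-0029-01-01 and -0029-12-31) are guesses at what the conversion intended, and FIELD_WIDTH and truncate are hypothetical names, not taken from the actual MMM pipeline.

```python
FIELD_WIDTH = 10  # assumed width of the text field in the conversion step


def truncate(value, width=FIELD_WIDTH):
    """Cut a string to a fixed width, as a fixed-size text field would."""
    return value[:width]


# Assumed intended values for the Bodley timespan in the report.
for full in ("-0029-01-01", "-0029-12-31", "0029-01-01"):
    print(f"{full!r} (len {len(full)}) -> {truncate(full)!r}")
```

An AD date such as 0029-01-01 is exactly 10 characters and passes through unchanged, which would explain why only the BC dates are affected.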

razz0 commented 3 years ago

Thanks for reporting this. Indeed, none of the BC dates from the Bodley data were handled correctly. This is fixed in version 2.1.0 of the dataset.