DOREMUS-ANR / marc2rdf

Converter from UNIMARC/INTERMARC to RDF using the DOREMUS model
Apache License 2.0

Artist birth and death dates #57

Closed pasqLisena closed 7 years ago

pasqLisena commented 7 years ago

How we currently describe an artist's birth and death:

<http://data.doremus.org/artist/6963af5e-b126-3d40-a84b-97e0b78f5452>
        ecrm:P98i_was_born  "1770"^^xsd:gYear ;
        ecrm:P100i_died_in  "1827"^^xsd:gYear .

Problem: P98i and P100i should have an event as their range (they are ObjectProperties, not DataProperties). So the connection between an artist and their birth date should be Artist > Birth Event > Time Span > Date literal. That is very long.
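To make the range problem concrete, here is a minimal Python sketch (not part of the converter) that flags triples where an event-ranged property points directly at a literal. The tuple-based triple representation, the `Literal` type, and the helper name `literal_range_violations` are assumptions made for illustration; prefixed names are kept as plain strings.

```python
from typing import List, NamedTuple, Tuple, Union


class Literal(NamedTuple):
    """A typed RDF literal, e.g. "1770"^^xsd:gYear."""
    lexical: str
    datatype: str


# A triple is (subject, predicate, object); IRIs are plain strings in this sketch.
Triple = Tuple[str, str, Union[str, Literal]]

# Properties named in this issue whose range should be an event, not a literal.
EVENT_RANGED = {"ecrm:P98i_was_born", "ecrm:P100i_died_in"}


def literal_range_violations(triples: List[Triple]) -> List[Triple]:
    """Return the triples that attach a literal to an event-ranged property."""
    return [t for t in triples if t[1] in EVENT_RANGED and isinstance(t[2], Literal)]


ARTIST = "http://data.doremus.org/artist/6963af5e-b126-3d40-a84b-97e0b78f5452"

# The description as currently produced: dates hang directly off the artist.
current_output = [
    (ARTIST, "ecrm:P98i_was_born", Literal("1770", "xsd:gYear")),
    (ARTIST, "ecrm:P100i_died_in", Literal("1827", "xsd:gYear")),
]

# An event-based description: the object is the IRI of a birth event.
event_based = [
    (ARTIST, "ecrm:P98i_was_born", ARTIST + "/birth"),
]

print(len(literal_range_violations(current_output)))
print(len(literal_range_violations(event_based)))
```

Both triples of the current output are reported, while the event-based triple passes the check.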

Different proposals:

  1. bio:birth, which is an ObjectProperty (used incorrectly by BnF)
  2. schema:birthDate
  3. dbo:birthDate
  4. ... others?

pierrechoffe commented 7 years ago

True, the birth event is a bit longer, but it allows for richer descriptions: date of birth, place of birth, parents X & Y, notes about the birth, etc. I made a schema about it a while ago. It is simplified (there is no Time-Span step, which should definitely be there), but it gives a good idea of the possible ramifications.

And, since we are in a dynamic model, why not use event-based descriptions rather than regressing to static, attribute-like descriptions?

pasqLisena commented 7 years ago

> And, since we are in a dynamic model, why not use event-based descriptions rather than regressing to static, attribute-like descriptions?

What if we put both, to allow both simpler and richer queries? :)

Good schema anyway

pierrechoffe commented 7 years ago

> What if we put both, to allow both simpler and richer queries? :)

What would the schema be in this case?

pierrechoffe commented 7 years ago

@pasqLisena Btw, when I made this schema I was interested in one additional point, which relates to our discussion: the possibility of inferring the nationality of a person from information about their birth and death places. In this (fake) case, the person was born Russian but died Ukrainian, after Ukraine became independent. So maybe this is one more argument in favor of the Birth and Death events? ;)

pasqLisena commented 7 years ago

> What would the schema be in this case?

Considering also #56:

<http://data.doremus.org/artist/6963af5e-b126-3d40-a84b-97e0b78f5452>
        schema:birthDate    "1770"^^xsd:gYear ;
        schema:deathDate    "1827"^^xsd:gYear ;
        ecrm:P98i_was_born  <http://data.doremus.org/artist/6963af5e-b126-3d40-a84b-97e0b78f5452/birth> ;
        ecrm:P100i_died_in  [ ] .  # similar to the birth event

<http://data.doremus.org/artist/6963af5e-b126-3d40-a84b-97e0b78f5452/birth>
        a ecrm:E67_Birth ;
        ecrm:P7_took_place_at <http://sws.geonames.org/524901/> ;
        ecrm:P4_has_time-span <http://data.doremus.org/artist/6963af5e-b126-3d40-a84b-97e0b78f5452/birth/time> .

<http://data.doremus.org/artist/6963af5e-b126-3d40-a84b-97e0b78f5452/birth/time>
        a ecrm:E52_Time-Span, tl:Interval ;
        rdfs:label "1770" ;
        tl:start "1770"^^xsd:gYear ;
        tl:end "1770"^^xsd:gYear .

delahousse commented 7 years ago

+1 for Pasquale's example

rtroncy commented 7 years ago

Hmm, two small observations:

Your rationale is that it makes simpler queries, but with SPARQL 1.1 property paths, this is not quite right. With your shortcut:

PREFIX schema: <http://schema.org/>

select ?p ?b
where {
  ?p schema:birthDate ?b
}

Without your shortcut:

PREFIX ecrm: <http://erlangen-crm.org/current/>
PREFIX time: <http://www.w3.org/2006/time#>

select ?p ?b
where {
  ?p ecrm:P98i_was_born/ecrm:P4_has_time-span/time:hasBeginning/time:inXSDDate ?b
}

In practice, given that there are not many options after each node, the SPARQL query evaluation plans for the two queries should be roughly equivalent.
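The equivalence of the two queries can be illustrated without a SPARQL engine. The following Python sketch evaluates a sequence path (`p1/p2/...`) over a small in-memory list of triples and compares it with the one-step shortcut. The helper `follow_path` is hypothetical, and the path uses `tl:start` from the Turtle example earlier in this thread rather than the `time:` properties in the query above; both choices are assumptions made to keep the example self-contained.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def follow_path(triples: List[Triple], path: List[str]) -> List[Tuple[str, str]]:
    """Evaluate a sequence path p1/p2/... and return (start, end) pairs."""
    # Index the graph by predicate so each step is a simple lookup.
    by_pred: Dict[str, List[Tuple[str, str]]] = {}
    for s, p, o in triples:
        by_pred.setdefault(p, []).append((s, o))

    # Start with every (subject, object) pair of the first predicate,
    # then extend each pair one predicate at a time.
    pairs = list(by_pred.get(path[0], []))
    for pred in path[1:]:
        step = by_pred.get(pred, [])
        pairs = [(start, o) for start, mid in pairs for s, o in step if s == mid]
    return pairs


ARTIST = "http://data.doremus.org/artist/6963af5e-b126-3d40-a84b-97e0b78f5452"

triples = [
    (ARTIST, "schema:birthDate", "1770"),
    (ARTIST, "ecrm:P98i_was_born", ARTIST + "/birth"),
    (ARTIST + "/birth", "ecrm:P4_has_time-span", ARTIST + "/birth/time"),
    (ARTIST + "/birth/time", "tl:start", "1770"),
]

shortcut = follow_path(triples, ["schema:birthDate"])
full = follow_path(
    triples, ["ecrm:P98i_was_born", "ecrm:P4_has_time-span", "tl:start"]
)
print(shortcut == full)
```

Each extra step in the path is one more join; with a single outgoing edge per node, as here, the intermediate result never grows.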

pasqLisena commented 7 years ago

The difference between these two queries is that the former is much more readable :) I am also thinking about other people who would like to use our data. Moreover, we already duplicate information in some places, as with U70_has_title and rdfs:label. Both values are instantiated at the same time by the converter, so there will be no inconsistency. I am still in favour of including both if this is the only objection.
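As a rough illustration of why emitting both forms in one step keeps them consistent, here is a Python sketch (not the converter's actual code) that derives the shortcut and the event-based path from a single parsed year. The function names `birth_statements` and `to_turtle` are hypothetical, and the output assumes the `schema:`, `ecrm:`, `tl:` and `xsd:` prefixes are declared elsewhere.

```python
from typing import List, Tuple

Statement = Tuple[str, str, str]


def birth_statements(artist_uri: str, year: str) -> List[Statement]:
    """Build the shortcut and the event-based description from one value."""
    # The literal is built once, so both forms necessarily agree.
    literal = f'"{year}"^^xsd:gYear'
    artist = f"<{artist_uri}>"
    birth = f"<{artist_uri}/birth>"
    span = f"<{artist_uri}/birth/time>"
    return [
        (artist, "schema:birthDate", literal),
        (artist, "ecrm:P98i_was_born", birth),
        (birth, "a", "ecrm:E67_Birth"),
        (birth, "ecrm:P4_has_time-span", span),
        (span, "a", "ecrm:E52_Time-Span"),
        (span, "tl:start", literal),
        (span, "tl:end", literal),
    ]


def to_turtle(statements: List[Statement]) -> str:
    """Serialize statements as one Turtle triple per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in statements)


statements = birth_statements(
    "http://data.doremus.org/artist/6963af5e-b126-3d40-a84b-97e0b78f5452", "1770"
)
print(to_turtle(statements))
```

The same pattern would apply to the death date with `schema:deathDate` and `ecrm:P100i_died_in`.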

rtroncy commented 7 years ago

Right, but people are not supposed to see queries, nor raw data :-)

I'm fine, though, with also making an exception here and duplicating those values between schema:birthDate (resp. schema:deathDate) and the longer ecrm path, as we do between rdfs:label and mus:U70_has_title.

@nicolasGuillouet Do you watch this issue? This has consequences on your converter too! @pasqLisena Can you implement those changes and close the issue afterwards?