lobid / lodmill

This repo has been replaced by, among others, https://github.com/hbz/lobid-resources/

Missing subfield `d` (birth/death date) for person subjects #771

Closed acka47 closed 8 years ago

acka47 commented 8 years ago

From https://github.com/lobid/lodmill/issues/766#issuecomment-165767182.

Example HT016671946 (snippet):

<datafield tag="902" ind1="-" ind2="1">
  <subfield code="p">Jérôme</subfield>
  <subfield code="c">Westfalen, König</subfield>
  <subfield code="d">1784-1860</subfield>
  <subfield code="9">(DE-588)118557432</subfield>
</datafield>

Current JSON:

{
  "@id" : "http://d-nb.info/gnd/118557432",
  "preferredName" : "Jérôme, Westfalen, König",
  "preferredNameForThePerson" : "Jérôme, Westfalen, König"
}

dr0i commented 8 years ago

We once had the birth/death dates included in the data. We had a discussion about it, see https://github.com/hbz/lobid/issues/244, and removed them. What to do now? Try to provide an extra field under the GND resource?

jschnasse commented 8 years ago

+1 for the extra field, similar to gnd rdf http://d-nb.info/gnd/118557432/about/lds

acka47 commented 8 years ago

> We once had the birth/death dates included in the data. We had a discussion about it, see https://github.com/hbz/lobid/issues/244, and removed them.

The discussion in https://github.com/hbz/lobid/issues/244 was about birth dates for contributors. This is about birth dates for persons as subjects. As we can easily separate them in the source data, we should leave out birth dates for the former but add them to the latter. No need for an extra field...

dr0i commented 8 years ago

What about @jschnasse's proposal? I like that it separates things, and it's even much easier to transform the data into that structure (in fact, I've just finished that :) )

> The discussion in hbz/lobid#244 was about birth dates for contributors. This is about birth dates for persons as subjects.

Yes, but see the follow-up discussion at https://github.com/lobid/lodmill/issues/742#issuecomment-151155100.

> As we can easily separate them in the source data, we should leave out birth dates for the former but add them to the latter.

It's not that easy: on the one hand, many rules can be used for both subjects and creators/contributors; on the other hand, there are a lot of specialized rules. The same is true for creator vs. contributor. The complex morph reflects that mess. (I have to admit that simplifying things at the data level brings not only pros (by making the data more granular) but also cons (by outsourcing the logic to the frontend). But in conclusion, it definitely makes sense when dealing with multiple consumers, because they may have different demands, and it's easier for them to deal with these when strings are not concatenated but left in their respective key-value structure.)

Btw, what speaks against giving dates their own property?
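To make the "own property" idea concrete, here is a minimal sketch in Python of how the 902 datafield from the example could be mapped to a resource that keeps the dates in a separate key instead of concatenating them into the name. This is not the actual morph used in lodmill; the function name `person_subject` and the key `dateOfBirthAndDeath` are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET

# The snippet from the issue description (record HT016671946).
SNIPPET = """
<datafield tag="902" ind1="-" ind2="1">
  <subfield code="p">Jérôme</subfield>
  <subfield code="c">Westfalen, König</subfield>
  <subfield code="d">1784-1860</subfield>
  <subfield code="9">(DE-588)118557432</subfield>
</datafield>
"""

GND_PREFIX = "(DE-588)"


def person_subject(datafield_xml):
    """Map one person-subject datafield to a dict with separate keys."""
    field = ET.fromstring(datafield_xml.strip())
    # Collect subfields by their code, e.g. {"p": "Jérôme", ...}.
    sub = {s.get("code"): (s.text or "").strip() for s in field.findall("subfield")}

    # The name is built from subfields p and c only; dates stay out of it.
    name = ", ".join(part for part in (sub.get("p"), sub.get("c")) if part)
    resource = {"preferredName": name}

    gnd_id = sub.get("9", "")
    if gnd_id.startswith(GND_PREFIX):
        resource["@id"] = "http://d-nb.info/gnd/" + gnd_id[len(GND_PREFIX):]

    # Hypothetical key name: the dates get their own property.
    if sub.get("d"):
        resource["dateOfBirthAndDeath"] = sub["d"]
    return resource
```

Calling `person_subject(SNIPPET)` yields a dict in which `preferredName` is unchanged and the life dates sit under their own key, so each consumer can decide whether to use them.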

acka47 commented 8 years ago

Thinking about it, you are both right to have it in a separate field for API 2.0. The question is whether we should do this for all subfields and thus make our data more MARCy...

dr0i commented 8 years ago

Deployed, see e.g. http://gaia.hbz-nrw.de:9200/resources-staging/resource/HT016671946. Note that the birth/death dates are deliberately only added to persons who are subjects.

acka47 commented 8 years ago

Looks good – for persons as subjects. But I still think this can be improved. (Sorry.)

> Note that the birth/death dates are deliberately only added to persons who are subjects.

I think it would be ok and make sense to also deliver birth dates of authors via the API. What customers asked for was not to show them in the UI – which can easily be implemented as long as the information is in separate fields.
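A minimal sketch of that split, assuming the API delivers the dates in a separate key for every person and the UI decides per role whether to show them. The key `dateOfBirthAndDeath` and the role names are hypothetical, not taken from the lobid API.

```python
def display_label(person, role):
    """Build the UI label for a person.

    `role` is assumed to be "subject" or "contributor"; dates are
    shown for subjects only, even if the data carries them for both.
    """
    name = person["preferredName"]
    dates = person.get("dateOfBirthAndDeath")  # hypothetical key
    if role == "subject" and dates:
        return f"{name} ({dates})"
    return name
```

Because the dates are never baked into `preferredName`, changing the display rule later only touches this function, not the data.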

dr0i commented 8 years ago

Now I agree that all things can be improved. And if you are willing to say "person subjects" (as this issue's title actually says) are hereby properly treated ...

dr0i commented 8 years ago

s/seperate/separate

acka47 commented 8 years ago

> Now I agree that all things can be improved. And if you are willing to say "person subjects" (as this issue's title actually says) are hereby properly treated ...

I am tempted to change the issue's title back... However, +1 and we'll get on with it...

dr0i commented 8 years ago

you have a point there, but thx!

fsteeg commented 8 years ago

I'm assigned to display the life dates in NWBib? So for http://test.lobid.org/nwbib/HT016671946, 'Schlagwörter' should contain Jérôme, Westfalen, König (1784-1860)?

acka47 commented 8 years ago

> I'm assigned to display the life dates in NWBib? So for http://test.lobid.org/nwbib/HT016671946, 'Schlagwörter' should contain Jérôme, Westfalen, König (1784-1860)?

Yes, but it doesn't have that much priority.

fsteeg commented 8 years ago

The current transformation for this in subject chains looks wrong to me:

http://nwbib.de/HT016671946 -> Jérôme (1784-1860), Westfalen, König

As for the subject part, shouldn't it be Jérôme, Westfalen, König (1784-1860)?

I seem to remember that it was like that before, could this be a regression @dr0i?

fsteeg commented 8 years ago

Deployed the nwbib part to staging: http://test.nwbib.de/HT016671946

If only birth is given: http://test.nwbib.de/HT018866836

dr0i commented 8 years ago

> I seem to remember that it was like that before, could this be a regression @dr0i?

No, I think it was always deliberately so: "Name (birth dates), title etc.". For me it also feels better to write the time period right after the person's name, because it's the person's life dates. If it's written at the end, as in Jérôme, Westfalen, König (1784-1860), I would assume that he was king for that period (well, semantically unlikely, but structurally so). No?
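The two placements under discussion can be compared with a small sketch. It assumes the name, the addition (title etc.) and the dates are available as separate values; the function name `chain_label` and its parameters are made up for this illustration and are not part of the actual transformation.

```python
def chain_label(name, addition=None, dates=None, dates_after_name=True):
    """Join the parts of a person subject into one label.

    dates_after_name=True  -> "Name (dates), addition"
    dates_after_name=False -> "Name, addition (dates)"
    """
    if not dates:
        return ", ".join(part for part in (name, addition) if part)
    if dates_after_name:
        head = f"{name} ({dates})"
        return ", ".join(part for part in (head, addition) if part)
    full = ", ".join(part for part in (name, addition) if part)
    return f"{full} ({dates})"


# Variant currently used in the subject chains:
print(chain_label("Jérôme", "Westfalen, König", "1784-1860"))
# Variant asked about for the subject part:
print(chain_label("Jérôme", "Westfalen, König", "1784-1860", dates_after_name=False))
```

With the parts kept separate in the data, either variant is a display decision rather than something fixed in the transformation.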

acka47 commented 8 years ago

+1 for NWBib and the subject chains. (It doesn't make sense to put more work in the subjectChain field as we will get rid of it anyway with API 2.0.)