DiSSCo / SDR

Specimen Data Refinery
Apache License 2.0

Implement an openDS dynamic schema #27

Closed benscott closed 1 year ago

benscott commented 3 years ago

Building upon #26, the openDS schema should be able to evolve over time.

Allow multiple schema versions and updates, with data versioning.
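One way to picture this is a minimal sketch in which every schema version stays registered and each record names the version it was written against. Everything here is an assumption for illustration: `SchemaRegistry`, the `schema_version` and `data` keys, and the field names are hypothetical and not part of openDS.

```python
from dataclasses import dataclass, field


@dataclass
class SchemaRegistry:
    """Hypothetical in-memory registry that keeps every schema version."""
    versions: dict = field(default_factory=dict)

    def register(self, version: str, required_fields: set) -> None:
        # Published versions are never overwritten, so old data stays checkable.
        if version in self.versions:
            raise ValueError(f"schema version {version} is already registered")
        self.versions[version] = frozenset(required_fields)

    def validate(self, record: dict) -> bool:
        # Check a record against the schema version it declares, not the latest.
        required = self.versions[record["schema_version"]]
        return required.issubset(record["data"])


registry = SchemaRegistry()
registry.register("0.1", {"id"})
registry.register("0.2", {"id", "collector"})

old_record = {"schema_version": "0.1", "data": {"id": "specimen-1"}}
new_record = {"schema_version": "0.2", "data": {"id": "specimen-1"}}
```

In this sketch `old_record` remains valid after version 0.2 is registered, while `new_record` fails because it declares 0.2 but lacks the `collector` field.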

infinite-dao commented 2 years ago

I’m involved in 4.6 People Linkage for collectors etc. So far I have managed to write a Python wrapper that adds a collector with a command like this:

python 'main.py' -i ./test-data/open-ds-input.json -o ./test-data/open-ds-output.json --collector 'Linné'
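For readers unfamiliar with such a wrapper, the command-line interface could be declared roughly as follows. This is only a sketch using the standard `argparse` module: the short flags `-i`, `-o` and `--collector` come from the command above, while the long names `--input` and `--output`, the help texts and the file names are assumptions, not the actual contents of `main.py`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical argument parser mirroring the flags of the command above."""
    parser = argparse.ArgumentParser(
        description="Add collector search results to an openDS JSON file (sketch)."
    )
    parser.add_argument("-i", "--input", required=True,
                        help="path to the openDS input JSON")
    parser.add_argument("-o", "--output", required=True,
                        help="path of the openDS output JSON to write")
    parser.add_argument("--collector", required=True,
                        help="collector name to search for")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["-i", "in.json", "-o", "out.json", "--collector", "Linné"]
)
```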

As input/output I expect:

After some thought, I came up with this:

"original_collector_search": {
    "results": [
        {
            "item": {
                "type": "uri",
                "value": "http://www.wikidata.org/entity/Q1043"
            },
            "personLabel": {
                "xml:lang": "en",
                "type": "literal",
                "value": "Carl Linnaeus"
            },
            "personAltLabels": {
                "type": "literal",
                "value": "Carl Linn\u00e6us; Carl Nilsson Linn\u00e6us; Carl von Linnaeus; Carl von Linne; Carl von Linn\u00e9; Caroli Linn\u00e6i; Carolo Linnaeo; Carolo Linn\u00e6o; Carolus Linnaeus; Carolus Linn\u00e6us; Carolus a Linne; Carolus a Linn\u00e9; Karl von Linn\u00e9; L.; Linn.; Linnaeus; Linne; Linn\u00e6us; Linn\u00e9"
            },
            "genderLabel": {
                "xml:lang": "en",
                "type": "literal",
                "value": "male"
            },
            "dateOfBirth": {
                "datatype": "http://www.w3.org/2001/XMLSchema#dateTime",
                "type": "literal",
                "value": "1707-05-23T00:00:00Z"
            },
            "dateOfDeath": {
                "datatype": "http://www.w3.org/2001/XMLSchema#dateTime",
                "type": "literal",
                "value": "1778-01-10T00:00:00Z"
            },
            "image": {
                "type": "uri",
                "value": "http://commons.wikimedia.org/wiki/Special:FilePath/Carl%20von%20Linn%C3%A9.jpg"
            },
            "numApiOrdinal": {
                "datatype": "http://www.w3.org/2001/XMLSchema#int",
                "type": "literal",
                "value": "0"
            },
            "VIAF_ID_URI": {
                "type": "uri",
                "value": "http://viaf.org/viaf/34594730"
            },
            "ISNI_ID": {
                "type": "literal",
                "value": "0000 0001 2127 4957"
            },
            "WIKIDATA_ID_URI": {
                "type": "uri",
                "value": "http://www.wikidata.org/entity/Q1043"
            },
            "occupations": {
                "type": "literal",
                "value": "professor; ornithologist; bryologist; pteridologist; autobiographer; botanist; mycologist; entomologist; zoologist; geologist; biologist; naturalist; physician"
            },
            "occupationsWithLangPrefix": {
                "type": "literal",
                "value": "en:professor; en:ornithologist; en:bryologist; en:pteridologist; en:autobiographer; en:botanist; en:mycologist; en:entomologist; en:zoologist; en:geologist; en:biologist; en:naturalist; en:physician"
            }
        },
        {...},
        {...}
    ],
    "summary": "3 results searching collector \u201cLinne\u201d: Carl Linnaeus; Carl Linnaeus the Younger; Johan Stensson Rothman."
}
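Each entry in `results` wraps its values in `type`/`value` objects, so a consumer would probably want to flatten them. A minimal sketch, assuming the structure shown above; `flatten_result` and the choice of which fields to split on `;` are hypothetical and not part of the wrapper:

```python
def flatten_result(result: dict,
                   list_fields=("personAltLabels", "occupations")) -> dict:
    """Reduce each {"type": ..., "value": ...} object to its plain value."""
    flat = {}
    for key, binding in result.items():
        value = binding["value"]
        if key in list_fields:
            # These fields hold several entries joined by "; " in one literal.
            value = [part.strip() for part in value.split(";")]
        flat[key] = value
    return flat


# Shortened sample in the same shape as one entry of "results".
sample = {
    "personLabel": {"xml:lang": "en", "type": "literal", "value": "Carl Linnaeus"},
    "WIKIDATA_ID_URI": {"type": "uri", "value": "http://www.wikidata.org/entity/Q1043"},
    "occupations": {"type": "literal", "value": "botanist; zoologist; physician"},
}
flat = flatten_result(sample)
```

Note that flattening discards the `type`, `datatype` and `xml:lang` information, which may or may not matter for the openDS output.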

Ideas behind this:

Questions to clarify:

Thank you for any clarifications.

Andreas

infinite-dao commented 2 years ago

I see: this is perhaps better suited to the issues in https://github.com/DiSSCo/openDS

llivermore commented 2 years ago

See https://nsidr.org/#objects/20.5000.1025/ODStypeV0.2

llivermore commented 1 year ago

Implementing a schema dynamically would probably break workflows at some point. It would need to be done in a FAIR way, and it's questionable whether we would want to (or could sensibly) implement a dynamic schema. NB: We have not updated the fields to the latest specification; the first stable release is due after SYNTHESYS+ funding is over. See also #18.