isamplesorg / isamples_inabox

Provides functionality intermediate to a collection and central

Example JSON has oc-gen:cat-site and oc-gen:cat-square, but those fields aren't present in the OpenContext JSON we fetch #303

dannymandel closed this issue 9 months ago

dannymandel commented 10 months ago

In the example JSON here: https://github.com/isamplesorg/metadata/blob/develop/examples/OpenContext/test1.0Valid/ark-28722-k28d0b21r-v1.json

the keywords have scheme_name of oc-gen:cat-site and oc-gen:cat-square. However, those fields aren't present in the JSON we retrieve from this URL: https://opencontext.org/query/.json?attributes=ALL-STANDARD-LD&cat=oc-gen-cat-sample-col%7C%7Coc-gen-cat-bio-subj-ecofact%7C%7Coc-gen-cat-object&response=metadata%2Curi-meta&sort=updated--desc&type=subjects&rows={OPENCONTEXT_PAGE_SIZE}.

They are, however, present in a more detailed JSON page for each record (e.g. https://opencontext.org/subjects/f423496d-c695-46da-97c5-aacc89553f69.json).

We either can't use those schemes, or we need to change the way we fetch the records.
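If we went the second route, one way to reach the detailed page might look like the sketch below. It assumes each summary record carries an "href" of the form https://opencontext.org/subjects/<uuid>, as the records posted later in this thread do, and that appending ".json" gives the detail page, as in the example URL above. The function name is hypothetical.

```python
def detail_json_url(record: dict) -> str:
    """Return the URL of the detailed JSON page for a summary record.

    Assumption: record["href"] is the HTML page for the subject, and the
    detailed JSON lives at the same URL with a ".json" suffix.
    """
    href = record["href"].rstrip("/")
    # Avoid doubling the suffix if the href already points at the JSON page.
    return href if href.endswith(".json") else href + ".json"


summary = {"href": "https://opencontext.org/subjects/f423496d-c695-46da-97c5-aacc89553f69"}
print(detail_json_url(summary))
```

The obvious cost is one extra request per record on top of the paged query.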

dannymandel commented 10 months ago

@smrgeoinfo @datadavev @ekansa, I need some guidance here.

dannymandel commented 10 months ago

It does look like we have pieces of it available in this field:

 "context label": "Europe/Cyprus/Polis Chrysochous/E.F2:R09",

but we don't have the scheme for each of those pieces.

dannymandel commented 10 months ago

It does look like we could grab the last piece of the context label and use the context href as the uri, e.g.

        {
            "keyword": "E.F2:R09",
            "keyword_uri": "https://opencontext.org/subjects/9d20d284-1cc2-4381-8940-0de1bfc10d87",
            "scheme_name": "???"
        }

but I don't know what we would use for scheme_name there, because there isn't an indication of what it is.
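A minimal sketch of that idea, assuming the flat record format with "context label" and "context href" keys. The function name is hypothetical, and scheme_name is left as None because the record gives no scheme for the piece:

```python
from typing import Optional


def keyword_from_context(record: dict) -> Optional[dict]:
    """Build a keyword entry from the last segment of the context label."""
    label = record.get("context label")
    href = record.get("context href")
    if not label or not href:
        return None
    # The context label is a "/"-separated path; the last piece is the
    # most specific context.
    last_piece = label.rsplit("/", 1)[-1]
    return {
        "keyword": last_piece,
        "keyword_uri": href,
        "scheme_name": None,  # unknown: the record does not say what this is
    }


record = {
    "context label": "Europe/Cyprus/Polis Chrysochous/E.F2:R09",
    "context href": "https://opencontext.org/subjects/9d20d284-1cc2-4381-8940-0de1bfc10d87",
}
print(keyword_from_context(record))
```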

dannymandel commented 10 months ago

This is an example of a recent record we've fetched in the updated OpenContext format:

 {
 "uri": "http://opencontext.org/subjects/a5c171f9-9403-4e55-8f64-adf1047b703d", 
 "href": "https://opencontext.org/subjects/a5c171f9-9403-4e55-8f64-adf1047b703d",
 "icon": "https://opencontext.org/static/oc/icons-v2/object-icon-draft-2.svg", 
 "label": "Reg. 697", 
 "Creator": ["Joanna Smith"], 
 "License": ["Attribution 4.0 International (CC BY 4.0)"], 
 "updated": "2022-10-23T07:15:31Z", 
 "latitude": 35.034889, 
 "longitude": 32.421841, 
 "published": "2017-01-30T22:57:28Z", 
 "Consists of": ["glass (material)"], 
 "late bce/ce": 1000.0, 
 "citation uri": "https://n2t.net/ark:/28722/k26h4xk1f", 
 "context href": "https://opencontext.org/subjects/9d20d284-1cc2-4381-8940-0de1bfc10d87", 
 "early bce/ce": -800.0, 
 "project href": "https://opencontext.org/projects/766d9fd5-2175-41e3-b7c9-7eba6777f1f0", 
 "Creator [URI]": ["http://opencontext.org/persons/6c34c167-1a30-4820-956d-474c73c07085"], 
 "License [URI]": ["https://creativecommons.org/licenses/by/4.0"], 
 "context label": "Europe/Cyprus/Polis Chrysochous/E.F2:R09", 
 "item category": "Object", 
 "project label": "Excavations at Polis", 
 "Consists of [URI]": ["https://vocab.getty.edu/aat/300010797"], 
 "inorganic material": ["glass (material)"], 
 "inorganic material [URI]": ["https://vocab.getty.edu/aat/300010797"], 
 "inorganic material [getty-aat-300010360]": ["glass (material)"], 
 "inorganic material [getty-aat-300010360] [URI]": ["https://vocab.getty.edu/aat/300010797"]
 }

datadavev commented 10 months ago

It may be simplest in the long run to have a transform of the detail record into the iSamples model, and integrate that transform with the OC service (if acceptable). That way OC could emit records ready for iSamples and all that would be needed is exposure of a sitemap for those records to facilitate harvest.

smrgeoinfo commented 10 months ago

Danny, the recent record looks most useful. I'll look at the Open Context docs and website about their query export options to see what might be possible.

dannymandel commented 10 months ago

Eric says he can include this in the OC API.

ekansa commented 9 months ago

Hi all, here's a revision to the API on our staging site. The API now returns nested objects that should be easier to manage and are hopefully a bit more consistent: https://staging.opencontext.org/query/.json?attributes=iSamples&cat=oc-gen-cat-sample-col%7C%7Coc-gen-cat-bio-subj-ecofact%7C%7Coc-gen-cat-object&response=metadata%2Curi-meta&sort=updated--desc&type=subjects&rows=75

Here's an example record from that revised API. Note the inclusion of an iSamples "sampling site":

{
            "label": "PC 19890104",
            "uri": "http://opencontext.org/subjects/ff040a52-d8df-4edd-2370-e5ca84749125",
            "href": "https://staging.opencontext.org/subjects/ff040a52-d8df-4edd-2370-e5ca84749125",
            "citation uri": "https://n2t.net/ark:/28722/k2251kj8d",
            "id": "http://opencontext.org/subjects/ff040a52-d8df-4edd-2370-e5ca84749125",
            "project": {
                "label": "Murlo",
                "id": "http://opencontext.org/projects/df043419-f23b-41da-7e4d-ee52af22f92f"
            },
            "context": {
                "label": "Europe/Italy/Poggio Civitate/Tesoro/Tesoro 26/1989, ID:134",
                "id": "http://opencontext.org/subjects/c2df07ac-7fbd-47f6-4d82-328bada0cc8d"
            },
            "latitude": 43.152648338284486,
            "longitude": 11.402015740135283,
            "early bce/ce": -700.0,
            "late bce/ce": -535.0,
            "item category": "Object",
            "icon": "https://staging.opencontext.org/static/oc/icons-v2/object-icon-draft-2.svg",
            "thumbnail": "https://iiif.archivelab.org/iiif/opencontext-24-89-104abcjpg/full/150,/0/default.jpg",
            "published": "2012-12-28T00:00:00Z",
            "updated": "2023-07-26T08:24:25Z",
            "isam:SamplingSite": {
                "label": "Poggio Civitate",
                "identifier": "https://opencontext.org/subjects/871b9ef8-bc68-4190-5f8a-00882c0040a4",
                "id": "https://opencontext.org/subjects/871b9ef8-bc68-4190-5f8a-00882c0040a4"
            },
            "Consists of": [
                {
                    "label": "Impasto (pottery)",
                    "id": "https://en.wikipedia.org/wiki/Impasto_(pottery)"
                }
            ],
            "Has type": [
                {
                    "label": "sling bullet",
                    "id": "https://vocab.getty.edu/aat/300432860"
                }
            ],
            "Subject": [
                {
                    "label": "Architecture",
                    "id": "https://id.loc.gov/authorities/subjects/sh85006611"
                },
                {
                    "label": "Human settlements",
                    "id": "https://id.loc.gov/authorities/subjects/sh85062894"
                },
                {
                    "label": "Subsistence economy",
                    "id": "https://id.loc.gov/authorities/subjects/sh85129537"
                },
                {
                    "label": "Archaeology",
                    "id": "https://id.loc.gov/authorities/subjects/sh85006507"
                },
                {
                    "label": "Civilization, Etruscan",
                    "id": "https://id.loc.gov/authorities/subjects/sh85045470"
                }
            ],
            "Coverage": [
                {
                    "label": "Iron age",
                    "id": "https://id.loc.gov/authorities/subjects/sh85068153"
                }
            ],
            "Temporal Coverage": [
                {
                    "label": "Orientalizing (750 BCE - 582 BCE)",
                    "id": "https://n2t.net/ark:/99152/p06v8w45852"
                },
                {
                    "label": "Roman Imperial (31 CE - 399 CE)",
                    "id": "https://n2t.net/ark:/99152/p06v8w43gnx"
                },
                {
                    "label": "Archaic (580 BCE - 482 BCE)",
                    "id": "https://n2t.net/ark:/99152/p06v8w4g9dz"
                }
            ],
            "Creator": [
                {
                    "label": "Anthony Tuck",
                    "id": "http://opencontext.org/persons/61d87033-881e-48b9-ff27-27337bbcdaa0"
                }
            ],
            "License": [
                {
                    "label": "Attribution 4.0 International (CC BY 4.0)",
                    "id": "https://creativecommons.org/licenses/by/4.0"
                }
            ]
        }

dannymandel commented 9 months ago

@smrgeoinfo Could you take a look at this new format and offer suggestions on how to best utilize the data that @ekansa is offering iSamples through the new API? Presumably the new isam:SamplingSite key would be what we use for oc-gen:cat-site. I'm not sure I see anything that would map to oc-gen:cat-square, though. Also, it looks like we should be including this information somewhere:

            "Subject": [
                {
                    "label": "Architecture",
                    "id": "https://id.loc.gov/authorities/subjects/sh85006611"
                },
                {
                    "label": "Human settlements",
                    "id": "https://id.loc.gov/authorities/subjects/sh85062894"
                },
                {
                    "label": "Subsistence economy",
                    "id": "https://id.loc.gov/authorities/subjects/sh85129537"
                },
                {
                    "label": "Archaeology",
                    "id": "https://id.loc.gov/authorities/subjects/sh85006507"
                },
                {
                    "label": "Civilization, Etruscan",
                    "id": "https://id.loc.gov/authorities/subjects/sh85045470"
                }
            ],

Those look like natural fits for our keywords field. What do you think?

smrgeoinfo commented 9 months ago

Danny -- I'm back from vacation and will go over that ASAP

ekansa commented 9 months ago

Hi All,

I'm a little confused about the interest in a "square" entity. Is there an expectation to surface to iSamples much more granular / specific sampling location information than a "sampling site"?

I can do that. But some of these entities will be stratigraphic units (specific deposits of dirt) and some will be "squares", "contexts", "trenches", "survey tracts" and all sorts of other kinds of contexts, depending on the recording system of the investigator.

dannymandel commented 9 months ago

I'm not 100% sure, but the oc-gen:cat-square example is here: https://github.com/isamplesorg/metadata/blob/a59d9b35062643928f868f85da5b32bb02a6b357/examples/OpenContext/test1.0Valid/ark-28722-k28d0b21r-v1.json#L39

It shows up in the iSamples keywords.

smrgeoinfo commented 9 months ago

That is just a suggestion, but I bet Eric is correct that no one is going to search for samples by 'square'; that information can be included in the produced_By/sampling_site/description. It might be useful for someone evaluating the sample for their interests.
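A rough sketch of what that could look like against the revised OC record, using its "isam:SamplingSite" and nested "context" keys. The output key names are assumptions read off the path above (lowercased here as produced_by) and are not checked against the schema; the function name and the "label" field are hypothetical.

```python
def sampling_site_from_oc(oc_record: dict) -> dict:
    """Put the site label and the full context path under produced_by.sampling_site."""
    site = oc_record.get("isam:SamplingSite") or {}
    context = oc_record.get("context") or {}
    sampling_site = {}
    if site.get("label"):
        sampling_site["label"] = site["label"]  # assumed field name
    if context.get("label"):
        # The full context path carries the square/trench/unit detail.
        sampling_site["description"] = "Context: " + context["label"]
    return {"produced_by": {"sampling_site": sampling_site}}


oc_record = {
    "context": {"label": "Europe/Italy/Poggio Civitate/Tesoro/Tesoro 26/1989, ID:134"},
    "isam:SamplingSite": {"label": "Poggio Civitate"},
}
print(sampling_site_from_oc(oc_record))
```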

smrgeoinfo commented 9 months ago

As for the keywords, @dannymandel: your example has the content, but it should be serialized like this according to the JSON schema:

            "keywords": [
                {
                    "keyword": "Architecture",
                    "keyword_uri": "https://id.loc.gov/authorities/subjects/sh85006611"
                },
                {
                    "keyword": "Human settlements",
                    "keyword_uri": "https://id.loc.gov/authorities/subjects/sh85062894"
                },
                {
                    "keyword": "Subsistence economy",
                    "keyword_uri": "https://id.loc.gov/authorities/subjects/sh85129537"
                },
                {
                    "keyword": "Archaeology",
                    "keyword_uri": "https://id.loc.gov/authorities/subjects/sh85006507"
                },
                {
                    "keyword": "Civilization, Etruscan",
                    "keyword_uri": "https://id.loc.gov/authorities/subjects/sh85045470"
                }
            ],

dannymandel commented 9 months ago

Yeah, what I included was the OC JSON. I'll take that part and include it in our keywords.
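Roughly, the conversion could look like the sketch below. It assumes the revised OC format, where "Subject" is a list of objects with "label" and "id", and the keyword/keyword_uri shape from the serialization above. The function name is hypothetical.

```python
def subjects_to_keywords(oc_record: dict) -> list:
    """Convert OpenContext "Subject" entries into iSamples-style keyword dicts."""
    keywords = []
    for subject in oc_record.get("Subject", []):
        label = subject.get("label")
        if not label:
            continue  # skip entries without a usable label
        keyword = {"keyword": label}
        if subject.get("id"):
            keyword["keyword_uri"] = subject["id"]
        keywords.append(keyword)
    return keywords


oc_record = {
    "Subject": [
        {"label": "Architecture", "id": "https://id.loc.gov/authorities/subjects/sh85006611"},
        {"label": "Archaeology", "id": "https://id.loc.gov/authorities/subjects/sh85006507"},
    ]
}
print({"keywords": subjects_to_keywords(oc_record)})
```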

dannymandel commented 9 months ago

Thanks for the clarifications, @smrgeoinfo and @ekansa!