globalbioticinteractions / carvalheiro2023

GloBI configuration to help index Luisa Carvalheiro, José Augusto Salim, Filipi Soares, Debora Drucker. 2023. WorldFAIR pilot data from: VisitationData_Luisa_Carvalheiro.

export annotation on a specific record to web annotation format #4

Open jhpoelen opened 9 months ago

Filipi-Soares commented 9 months ago

Hey folks. I've been thinking about what we talked about yesterday. If we go in that direction of representing the interactions as web annotations, maybe a model like this could work:

{
  "@context": "http://www.w3.org/ns/anno.jsonld",
  "id": "http://example.org/annotation123",
  "type": "Annotation",
  "motivatedBy": "interaction",
  "target": {
    "id": "http://example.org/plant123",
    "type": "Plant",
    "label": "Plant Species Name"
  },
  "body": {
    "id": "http://example.org/animal123",
    "type": "Animal",
    "label": "Animal Species Name"
  },
  "relationship": "visited flower of",
  "created": "2023-11-28T12:00:00Z",
  "generator": {
    "id": "http://example.org/researcher123",
    "type": "Person",
    "name": "Researcher Name"
  },
  "time": "2023-11-28T11:30:00Z",
  "location": "Geolocation Data"
}

Filipi-Soares commented 9 months ago

The IDs could be hashes @jhpoelen @zedomel
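One way hash-based IDs could be derived is sketched below. This is only an illustration, not how GloBI computes its identifiers: the helper name `content_id`, the choice of SHA-256, the sorted-key JSON serialization, and the `hash://sha256/` prefix are all assumptions made for the example.

```python
import hashlib
import json


def content_id(record: dict) -> str:
    """Return a content-derived identifier for a record (hypothetical helper)."""
    # Serialize with sorted keys and fixed separators so the same field
    # values always produce the same bytes, regardless of key order.
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    # Assumption: a hash://sha256/ URI form; any stable prefix would do.
    return f"hash://sha256/{digest}"


# Placeholder record taken from the annotation example above.
plant = {"type": "Plant", "label": "Plant Species Name"}
print(content_id(plant))
```

Because the ID depends only on the record's content, two people hashing the same record independently would arrive at the same ID, and any change to the record yields a different one.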

Filipi-Soares commented 9 months ago

'created' is the date when the interaction was recorded on the system; 'time' and 'location' refer to the interaction itself.
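To make the difference between the two timestamps concrete, here is a small sketch that assembles the proposed structure. The function name `build_annotation` and its parameters are hypothetical, it covers only a subset of the fields, and it assumes timezone-aware datetimes are passed in.

```python
import json
from datetime import datetime, timezone


def _utc_stamp(ts: datetime) -> str:
    # Expects a timezone-aware datetime; renders it in UTC with a Z suffix.
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_annotation(plant_id, animal_id, observed_at, location, recorded_at):
    """Assemble the proposed annotation as a dict (hypothetical helper)."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "type": "Annotation",
        "target": {"id": plant_id, "type": "Plant"},
        "body": {"id": animal_id, "type": "Animal"},
        "relationship": "visited flower of",
        # When the record was entered into the system.
        "created": _utc_stamp(recorded_at),
        # When and where the interaction itself took place.
        "time": _utc_stamp(observed_at),
        "location": location,
    }


annotation = build_annotation(
    "http://example.org/plant123",
    "http://example.org/animal123",
    observed_at=datetime(2023, 11, 28, 11, 30, tzinfo=timezone.utc),
    location="Geolocation Data",
    recorded_at=datetime(2023, 11, 28, 12, 0, tzinfo=timezone.utc),
)
print(json.dumps(annotation, indent=2))
```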

zedomel commented 9 months ago

@Filipi-Soares could we use an RDF triple like this one:

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
@prefix dwc: <http://rs.tdwg.org/dwc/terms/>.
@prefix ro:  <http://purl.obolibrary.org/obo/>.

<http://rebipp.org.br/occ/REBIPP:OCC:00101>
   a dwc:Occurrence;
   ro:RO_0002623 <http://rebipp.org.br/occ/REBIPP:OCC:00102>.

Here is a complete example in Turtle using the PPI vocabulary and Darwin Core Semantic Web (DwC-SW):

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
@prefix dcterms: <http://purl.org/dc/terms/>.
@prefix dwc: <http://rs.tdwg.org/dwc/terms/>.
@prefix dwciri: <http://rs.tdwg.org/dwc/iri/>.
@prefix dsw: <http://purl.org/dsw/>.
@prefix ro:  <http://purl.obolibrary.org/obo/>.

<http://rebipp.org.br/event/REBIPP:PPI:00001>
    a dwc:Event;
    dwc:eventDate "2000-01-05"^^xsd:date;
    dwc:eventTime "-86.298055"^^xsd:dateTime;
    dsw:locatedAt <http://rebipp.org.br/location/REBIPP:LOC:0001>;
    dcterms:relation <http://rebipp.org.br/mof/REBIPP:MOF:0001>.

<http://rebipp.org.br/location/REBIPP:LOC:0001>
    a dcterms:Location;
    dwc:decimalLatitude "-22.8261888"^^xsd:decimal;
    dwc:decimalLongitude "-47.0680352"^^xsd:decimal;
    dwc:geodeticDatum "EPSG:4326";
    dwc:country "Brazil";
    dwciri:inDescribedPlace <https://sws.geonames.org/9257717>.

<http://rebipp.org.br/organism/REBIPP:IND:00001>
    a dwc:Organism;
    dsw:hasOccurrence <http://rebipp.org.br/occ/REBIPP:OCC:00101>;
    dsw:hasIdentification <http://rebipp.org.br/id/REBIPP:ID:0001>.

<http://rebipp.org.br/organism/REBIPP:IND:00002>
    a dwc:Organism;
    dsw:hasOccurrence <http://rebipp.org.br/occ/REBIPP:OCC:00102>;
    dsw:hasIdentification <http://rebipp.org.br/id/REBIPP:ID:0002>.

<http://rebipp.org.br/occ/REBIPP:OCC:00101>
   a dwc:Occurrence;
   dsw:atEvent <http://rebipp.org.br/event/REBIPP:PPI:00001>;
   dcterms:relation <http://rebipp.org.br/mof/REBIPP:MOF:0002>;
   dcterms:relation <http://rebipp.org.br/mof/REBIPP:MOF:0003>;
   ro:RO_0002623 <http://rebipp.org.br/occ/REBIPP:OCC:00102>.

<http://rebipp.org.br/occ/REBIPP:OCC:00102>
   a dwc:Occurrence;
   dsw:atEvent <http://rebipp.org.br/event/REBIPP:PPI:00001>;
   dcterms:relation <http://rebipp.org.br/mof/REBIPP:MOF:0004>;
   ro:RO_0002622 <http://rebipp.org.br/occ/REBIPP:OCC:00101>.

<http://rebipp.org.br/id/REBIPP:ID:0001>
   a dwc:Identification;
   dwciri:toTaxon <https://www.gbif.org/species/5293403>.

<http://rebipp.org.br/id/REBIPP:ID:0002>
   a dwc:Identification;
   dwciri:toTaxon <https://www.gbif.org/species/1341976>.

<http://rebipp.org.br/mof/REBIPP:MOF:0001>
   a dwc:MeasurementOrFact;
   dwciri:measurementType <http://rs.rebipp.org.br/ppi/terms/resourceCollected>;
   dwc:measurementType "resourceCollected"^^xsd:string;
   dwc:measurementValue "pollen"^^xsd:string.

<http://rebipp.org.br/mof/REBIPP:MOF:0002>
   a dwc:MeasurementOrFact;
   dwciri:measurementType <http://rs.rebipp.org.br/ppi/terms/habit>;
   dwc:measurementType "habit"^^xsd:string;
   dwciri:measurementValue <http://purl.obolibrary.org/obo/FLOPO_0900033>;
   dwc:measurementValue "whole plant arborescent"^^xsd:string.

<http://rebipp.org.br/mof/REBIPP:MOF:0003>
   a dwc:MeasurementOrFact;
   dwciri:measurementType <http://rs.rebipp.org.br/ppi/terms/flowerLongevity>;
   dwc:measurementType "flowerLongevity"^^xsd:string;
   dwc:measurementValue "168"^^xsd:decimal.

<http://rebipp.org.br/mof/REBIPP:MOF:0004>
   a dwc:MeasurementOrFact;
   dwciri:measurementType <http://rs.rebipp.org.br/ppi/caste>;
   dwc:measurementType "caste"^^xsd:string;
   dwc:measurementValue "worker"^^xsd:string.

Filipi-Soares commented 9 months ago

@zedomel this RDF model is much more expressive, I like it a lot. The direction of the interaction is carried by the RO property on each occurrence, which just makes sense!
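As a rough illustration of that point, the two directed statements can be generated from a single plant/animal pair. The sketch below uses only the RO identifiers and occurrence IRIs that appear in the Turtle example; the function name `interaction_triples` and the constant names are hypothetical, and the output is plain N-Triples text rather than anything produced by an RDF library.

```python
RO = "http://purl.obolibrary.org/obo/"
# Inverse pair as used in the Turtle example: the plant occurrence carries
# RO_0002623 and the animal occurrence carries RO_0002622.
HAS_FLOWERS_VISITED_BY = RO + "RO_0002623"
VISITS_FLOWERS_OF = RO + "RO_0002622"


def interaction_triples(plant_occ: str, animal_occ: str) -> list[str]:
    """Return both directed statements as N-Triples lines."""
    return [
        f"<{plant_occ}> <{HAS_FLOWERS_VISITED_BY}> <{animal_occ}> .",
        f"<{animal_occ}> <{VISITS_FLOWERS_OF}> <{plant_occ}> .",
    ]


for line in interaction_triples(
    "http://rebipp.org.br/occ/REBIPP:OCC:00101",
    "http://rebipp.org.br/occ/REBIPP:OCC:00102",
):
    print(line)
```

Stating both directions is redundant if a reasoner knows the two properties are inverses, but it lets either occurrence be read on its own.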

jhpoelen commented 9 months ago

Nice! Please also compare with the model recently used by @tkuhn in https://github.com/globalbioticinteractions/globalbioticinteractions/issues/923#issuecomment-1829848372 .

Would be nice to at least have a conversation around this, especially as this may build a bridge to publishers like Pensoft @lyubomirpenev and IOS Press via https://knowledgepixels.com

Filipi-Soares commented 9 months ago

@jhpoelen @zedomel I was wondering if we should test this whole data transformation process with one dataset... We could take the Carvalheiro dataset, @jhpoelen could generate all the necessary hashes, and I could use RDFLib or OpenRefine to generate the RDF file based on the model @zedomel presented. What do you think?

jhpoelen commented 9 months ago

@Filipi-Soares I very much like the idea of experimenting with packaging original data, their derived data products, and a description of how they are connected.

Can you please be more specific about your idea? What are the workflow steps you have in mind? What would the derived products look like? How'd you cite the data package?

Thanks for being patient with me as I am trying to understand your idea.

Filipi-Soares commented 9 months ago

Hey @jhpoelen :) It is me who should be thanking you for your patience, I have difficulty writing down my ideas :flushed: So, regarding the data package citation, I think we should add a property (metadata_id) to the model Salim presented, to connect the data to its metadata record. We should include the data citation in the metadata, what do you think? IDK if it works, but we can talk more about it.

About the workflow, my initial thought was to take the datasets, upload them to GloBI, generate hashes for the animal occurrence, plant occurrence, event, and any other data points that need an ID, and then, back in the spreadsheet, add these hashes as the IDs for those things. With everything filled out, I can convert the sheets to RDF. Anyway, I don't know if it is possible to implement this, so maybe we can talk more about it at our meeting on Monday?
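The spreadsheet step of that workflow might look roughly like the sketch below. It is a stand-in only: the hashes here are computed locally with SHA-256 rather than generated by GloBI, the column names and the sample row are made up, and `row_hash` and `add_ids` are hypothetical helpers.

```python
import csv
import hashlib
import io

# Hypothetical column names; the real sheet's headers may differ.
PLANT_COLS = ["plant_name", "event_date", "locality"]
ANIMAL_COLS = ["animal_name", "event_date", "locality"]


def row_hash(row: dict, columns: list[str]) -> str:
    """Hash the chosen column values of one row (assumed ID scheme)."""
    # Join with a separator that should not occur in the data itself.
    joined = "\x1f".join(row[c].strip() for c in columns)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def add_ids(csv_text: str) -> str:
    """Return the same table with two occurrence ID columns appended."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fields = list(reader.fieldnames) + [
        "plant_occurrence_id",
        "animal_occurrence_id",
    ]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row["plant_occurrence_id"] = row_hash(row, PLANT_COLS)
        row["animal_occurrence_id"] = row_hash(row, ANIMAL_COLS)
        writer.writerow(row)
    return out.getvalue()


# Made-up sample row, for illustration only.
sheet = (
    "plant_name,animal_name,event_date,locality\n"
    "Plant A,Animal B,2000-01-05,Site 1\n"
)
print(add_ids(sheet))
```

With the ID columns filled in, each row has stable identifiers for its plant and animal occurrences that a later RDF conversion step could use as the subject and object IRIs.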