gbif / model-tests

Exploration of sample models

Arctos Events #12

Open MortenHofft opened 2 years ago

MortenHofft commented 2 years ago

I'm trying to recreate https://arctos.database.museum/guid/DMNS:Mamm:11098

For the events box I use the GraphQL query below and have the following notes. Some are just loose thoughts; others are simple, obvious bugs.

I can get almost everything I need to replicate the UI, with a few exceptions.

query {
  # the entity can have many IDs, so we need to ask for the entity through an entity identifiers table
  specimensIDs: allEntityIdentifiers(condition: {
    entityIdentifier: "https://arctos.database.museum/guid/DMNS:Mamm:11098"
  }) {
    # there should only be one entity with this ID
    nodes {
      entityId
      entityIdentifier
      entityIdentifierType
      specimen: entityByEntityId {
        entityId
        entityType

        # get the event data
        entityEventsByEntityId {
          totalCount
          nodes {
            eventByEventId {
              eventType
              # I'm unable to find the agent that appears in the arctos site
              # verification status is left out it seems
              # collection source: wild is not to be found either
              eventDate
              verbatimEventDate

              locationByLocationId {
                higherGeography
                locationAccordingTo # in the UI this is attached to the higher geography, which I think is also what the source is about
                locality
              }
              verbatimLocality
              # Associated Names - I do not see the data anywhere. Perhaps left out?

              locationByLocationId {
                georeferencesByLocationId { # I'm surprised to get back a list of georeferences for my location. How do I choose?
                  nodes {
                    decimalLatitude
                    decimalLongitude
                    preferredSpatialRepresentation # I guess this is primary_spatial_data: point-radius
                    geodeticDatum
                    coordinateUncertaintyInMeters
                    georeferenceSources # why is this plural?
                    georeferenceProtocol
                  }
                }
                minimumElevationInMeters
                maximumElevationInMeters
              }
              verbatimLatitude # filled with collectionMethod
              verbatimLongitude # filled with habitat

              # collectionMethod not there # "verbatimLatitude": "Sherman trap" - I see it in latitude though
              habitat # not filled 

              # list of images from event/place - those I cannot figure out how to get to. See https://github.com/timrobertson100/model-tests/issues/11

            }
          }
        }
      }
    }
  }
}
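
For reference, here is a minimal Python sketch of how a client might flatten the response to this query into one record per event. The response shape is inferred from the query's aliases (`specimensIDs`, `specimen`) and connection fields; the function name `events_for_specimen` and all sample values are hypothetical, not output from the actual service.

```python
from __future__ import annotations

from typing import Any


def events_for_specimen(response: dict[str, Any]) -> list[dict[str, Any]]:
    """Flatten the nested query response into one dict per event."""
    events = []
    for id_node in response["data"]["specimensIDs"]["nodes"]:
        specimen = id_node.get("specimen") or {}
        entity_events = specimen.get("entityEventsByEntityId") or {}
        for link in entity_events.get("nodes", []):
            event = link.get("eventByEventId") or {}
            location = event.get("locationByLocationId") or {}
            georef_connection = location.get("georeferencesByLocationId") or {}
            events.append({
                "eventType": event.get("eventType"),
                "eventDate": event.get("eventDate"),
                "higherGeography": location.get("higherGeography"),
                "locality": location.get("locality"),
                # Kept as a list because the model allows several per location.
                "georeferences": georef_connection.get("nodes", []),
            })
    return events


# Hypothetical response: the structure mirrors the query, the values are made up.
sample_response = {
    "data": {
        "specimensIDs": {
            "nodes": [{
                "entityId": "hypothetical-entity-id",
                "entityIdentifier": "https://arctos.database.museum/guid/DMNS:Mamm:11098",
                "specimen": {
                    "entityEventsByEntityId": {
                        "totalCount": 1,
                        "nodes": [{
                            "eventByEventId": {
                                "eventType": "hypothetical event type",
                                "eventDate": None,
                                "locationByLocationId": {
                                    "higherGeography": "hypothetical higher geography",
                                    "locality": None,
                                    "georeferencesByLocationId": {"nodes": []},
                                },
                            }
                        }],
                    }
                },
            }]
        }
    }
}

flattened = events_for_specimen(sample_response)
```

The `or {}` guards are there because any nested object could come back as null if the related row is missing.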

tucotuco commented 2 years ago

The verifying agent and status attached to the collection event were not mapped to our model.

The collection source is indeed missing. That is the equivalent of dwc:establishmentMeans. The best place for it conceptually would be an Assertion on the intersection of the Organism and the Event, i.e., an EntityEventAssertion. However, to avoid creating a primary key for EntityEvent and adding another table just for this purpose, a simplification could be to just add establishmentMeans to EntityEvent directly. Added in https://github.com/timrobertson100/model-tests/commit/1539d590c79b69e51635370fabe87563ca767687.
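
To make the trade-off concrete, here is a rough Python sketch of the two shapes described above. Only `establishmentMeans`, `EntityEvent`, and `EntityEventAssertion` come from the discussion; every other field name (`entity_event_id`, `assertion_type`, `assertion_value`) and the event ID are assumptions for illustration, not the actual schema.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class EntityEvent:
    """Join row between an Entity and an Event (the simplified option)."""
    entity_id: str
    event_id: str
    # Simplification: store the value directly on the join row.
    establishment_means: Optional[str] = None


@dataclass
class EntityEventAssertion:
    """Conceptual alternative: a separate assertion about an EntityEvent.

    This needs EntityEvent to have its own primary key (hypothetical
    field entity_event_id), which is the extra cost mentioned above.
    """
    entity_event_id: str
    assertion_type: str
    assertion_value: str


# Simplified option: one extra column, no extra table.
link = EntityEvent(
    entity_id="https://arctos.database.museum/guid/DMNS:Mamm:11098",
    event_id="hypothetical-event-id",
    establishment_means="wild",
)

# Assertion option: same information, one more table and key.
assertion = EntityEventAssertion(
    entity_event_id="hypothetical-entity-event-id",
    assertion_type="establishmentMeans",
    assertion_value="wild",
)
```

Both carry the same value; the difference is whether the model pays for an extra key and table to keep it as a general-purpose assertion.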

The locationAccordingTo was mapped incorrectly. In Arctos it refers only to the higher geography, not to the whole location. Fixed in https://github.com/timrobertson100/model-tests/commit/f6dcf8bd117cb3dbede00e1c8e960845e3e31c90.

We do not have the associated geography names in our exports from Arctos. Fine to ignore.

The separation of Georeference from Location in the model is partly for aesthetic convenience (so as not to make a giant table), but georeferences are also spatial interpretations of Locations, and because there can be multiple interpretations, I chose to show the model that way. Internally Arctos does have multiple georeferences per location, but in practice I would not expect them to share any other than the accepted one. Also, in practice, none of the examples in the data has more than one georeference. We have a couple of options. One is to add a boolean column isAcceptedGeoreference, in keeping with isAcceptedIdentification for Identifications, and set them all to True. Another is to change the cardinality in the model to one-to-one between Location and Georeference.
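
As a sketch of how a consumer could answer "how do I choose?" under the first option, the Python below picks the single accepted georeference from a list. It assumes the proposed `isAcceptedGeoreference` flag exists; the class, the helper name `accepted_georeference`, and the coordinates are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class Georeference:
    decimal_latitude: float
    decimal_longitude: float
    georeferenced_by: str
    # Proposed flag, mirroring isAcceptedIdentification on Identifications.
    is_accepted_georeference: bool = False


def accepted_georeference(georefs: list[Georeference]) -> Optional[Georeference]:
    """Return the accepted georeference, or None if none is flagged."""
    accepted = [g for g in georefs if g.is_accepted_georeference]
    if len(accepted) > 1:
        raise ValueError("expected at most one accepted georeference per location")
    return accepted[0] if accepted else None


# Made-up history for one location: an automated georeference and a later correction.
history = [
    Georeference(39.70, -105.00, "GEOLocate"),
    Georeference(39.71, -105.02, "collector", is_accepted_georeference=True),
]

chosen = accepted_georeference(history)
```

With the second option (one-to-one cardinality) this helper would be unnecessary, since the location would expose at most one georeference.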

The mapping issues that affected verbatimLatitude, verbatimLongitude, collectionMethod, and habitat were all fixed in https://github.com/timrobertson100/model-tests/commit/f29f35e172897b9ca977720890abf0ef9596d5ee.

The issue of missing images for Locations is addressed in issue #11.

MortenHofft commented 2 years ago

Thank you.

I'm not sure I get how the locationAccordingTo is to be used, but that is clearly a detail that could be solved by documentation.

Multiple georeferences. I'm not sure what the better solution is. I do not understand the needs well enough. Why are there many? Is it opinions? And if so should it have an Agent as well as the isAccepted flag?

tucotuco commented 2 years ago

> Thank you.
>
> I'm not sure I get how the locationAccordingTo is to be used, but that is clearly a detail that could be solved by documentation.

Is the Darwin Core documentation on dwc:locationAccordingTo insufficient?

> Multiple georeferences. I'm not sure what the better solution is. I do not understand the needs well enough. Why are there many? Is it opinions? And if so should it have an Agent as well as the isAccepted flag?

The need within Arctos is to track the history of opinions and sources. There might be a GEOLocate georeference that gets "improved" by someone and then corrected by the collector. Arctos cares about this as they would about an Identification history. As I was saying, I don't think this is a high priority at the level of aggregation, and sharing just the "accepted" one is probably going to be sufficient, at least until someone requests otherwise. As with Identifications, Georeferences are really just sophisticated Assertions, and both have the Agents built into them (identifiedBy, georeferencedBy).