AAFC-BICoE / dina-planning

AAFC-DINA planning repository

Record collecting location #19

Closed cgendreau closed 2 years ago

cgendreau commented 4 years ago

EDIT: This ticket changed a lot since its creation so the scope is now restricted to verbatim collecting locality

Should be able to enter the “location” of a collecting event using the coordinates and/or locality. The location can include min/max elevation.

DarwinCore Terms in scope:

Source: https://github.com/DINA-Web/dina-model-concepts/issues/15

This ticket is about recording the exact "location" of a collecting-event and should not be confused with tools that could be used to get the information from elsewhere (gazetteer).

For georeferencing activity, I would add:

Linked to #66

dshorthouse commented 3 years ago

Are you also considering these as part of the location:

dshorthouse commented 3 years ago

Also require a representation of a transect. A start decimalLatitude and decimalLongitude as well as an end decimalLatitude and decimalLongitude.
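As an illustration of the transect described above, here is a minimal sketch that stores a start point and an end point in decimal degrees. The class names `Coordinate` and `Transect` and the sample coordinates are hypothetical and not taken from the DINA model; only the `decimalLatitude`/`decimalLongitude` naming comes from the comment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Coordinate:
    """A point in decimal degrees (hypothetical helper type)."""
    decimal_latitude: float
    decimal_longitude: float

    def __post_init__(self):
        # Reject values outside the valid geographic ranges.
        if not -90.0 <= self.decimal_latitude <= 90.0:
            raise ValueError(f"latitude out of range: {self.decimal_latitude}")
        if not -180.0 <= self.decimal_longitude <= 180.0:
            raise ValueError(f"longitude out of range: {self.decimal_longitude}")


@dataclass(frozen=True)
class Transect:
    """A transect expressed as a start point and an end point (assumed shape)."""
    start: Coordinate
    end: Coordinate


# Made-up sample values, for illustration only.
transect = Transect(
    start=Coordinate(decimal_latitude=45.39, decimal_longitude=-75.71),
    end=Coordinate(decimal_latitude=45.395, decimal_longitude=-75.70),
)
```

Keeping both endpoints as the same point type means a single-point location and a transect can share validation logic.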

dshorthouse commented 3 years ago

Does this also include all the water body terms in DwC such as:

dshorthouse commented 3 years ago

The issue at hand here seems to be about context and linkability (my new word). The context here is a Collecting Event with its date and collector(s). This means there will be all kinds of "location" possibilities here ranging from transects, to historic place names, to named collecting sites, to present-day geopolitical entities (inclusive of waterbodies) that are likely to have geographic coordinates, shapes, and hierarchies. So... the question is, "How much information about location needs to be uniquely present here in Collecting Event to faithfully express the context & how much information can be held elsewhere within DINA and merely linked to (1:many?) from here?" Is it sufficient to have nothing more than a verbatim locality string (or many verbatim locality strings), perhaps along with any other textual representation of locality as physically written on the medium, inclusive of typed lats/longs (that might actually be incorrect)? Anything over and above that is subject to (more) interpretation & is best accomplished through linking to an external definition.

So...

In the interim while we await the time to build Sites, Location, Gazetteer...the whole or part home(s) for that external definition...what do we put here as a stop-gap, especially if we anticipate migrating any structured data here from legacy sources? Do we borrow from managed attributes?

heathercole commented 3 years ago

@cgendreau there is overlap/confusion between this issue and issue #66. I thought this issue was about data associated with the collection event and related 'site description' whereas ticket #66 was more about "the place in the world where this thing was found". I have asked the managers for feedback on #66 which I interpret as being part of/within the larger context of collection event/site description. I will provide their feedback asap.

This issue has been identified as a priority, can you explain what info/feedback/requirements you need? (feel free to wait until I add info on the other issue)

cgendreau commented 3 years ago

Is it sufficient to have nothing more than a verbatim locality string (or many verbatim locality strings), perhaps along with any other textual representation of locality as physically written on the medium, inclusive of typed lats/longs (that might actually be incorrect)?

I would say that for the first iteration we could indeed focus on that.

That should probably capture everything related to a Location on a label.

heathercole commented 3 years ago

@cgendreau happy to discuss/clarify. I don't think we are on the same page here/using the same language. The fields (and important other related fields) that you list are part of the feedback provided for ticket #66.

The fields you have listed do not include several vital fields relating to information from specimen labels.

cgendreau commented 3 years ago

Yes, that's why I specified for the first iteration, it is step by step and this would be the first step.

dshorthouse commented 3 years ago

This iteration is about Collecting Events. What this means is sharp focus on what is uniquely stored here, found nowhere else and consequently immune from interpretation made by anyone now or at any point in the future. There's no question that there are a heap of DwC terms that relate to the location piece of what is a Collecting Event, but they are ALL subject to interpretation. What we seek to do is cleanly separate the immutable bits from ephemeral, interpreted bits that will experience flux through all the future marching of geopolitics. And so, the pure location bits in Collecting Event are the sacrosanct, faithfully typed information about location as the collector expressed it, such that we can go back to it time & time again & gauge whether or not any or all of the interpretations made by those who did the georeferencing are presently correct. We may think that modern GPS units used in contemporary collecting events do away with all this, but surprisingly, these too are subject to future interpretation. A future iteration will address where the interpreted bits should be housed, where they get drawn from, and how these will relate to (= link to) a Collecting Event. But, they are peripheral to the core of what is a Collecting Event in a collections management system: someone collected something at a place they described at a specific point in time they recorded. This is VERY different from, "Me as collection manager declares that this is the person I know to be the collector, this is where I think they were on the planet, and this is when I think they were there doing their collecting." Although you're looking at the labels to make your statements, there's no discounting the possibility that you might be dead wrong.
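The separation argued for above can be sketched roughly as follows: the verbatim location lives on an immutable collecting event, while each interpretation is a separate record that only links back to it. This is an assumption-laden illustration, not the DINA data model; the class names, field names, and sample values are all hypothetical.

```python
import dataclasses
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CollectingEvent:
    """Verbatim information as the collector expressed it (immutable)."""
    event_id: str
    verbatim_locality: str
    verbatim_coordinates: Optional[str] = None
    verbatim_elevation: Optional[str] = None


@dataclass
class Georeference:
    """An interpretation held elsewhere and linked to the event by id."""
    collecting_event_id: str
    decimal_latitude: float
    decimal_longitude: float
    georeferenced_by: str


# Made-up sample data, for illustration only.
event = CollectingEvent(
    event_id="ce-1",
    verbatim_locality="5 mi N of Ottawa, along river",
    verbatim_coordinates="45 25 N 75 42 W",
)
interpretation = Georeference(
    collecting_event_id=event.event_id,
    decimal_latitude=45.49,
    decimal_longitude=-75.70,
    georeferenced_by="hypothetical curator",
)

# The verbatim record rejects later edits; interpretations can be revised freely.
overwrite_blocked = False
try:
    event.verbatim_locality = "Ottawa"
except dataclasses.FrozenInstanceError:
    overwrite_blocked = True
```

Because the interpretation only carries the event id, several competing georeferences can point at the same verbatim record without touching it.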

dshorthouse commented 3 years ago

Suggest renaming this ticket to "Record verbatim collecting locality"

cgendreau commented 3 years ago

But we agree it's locality including fields I mentioned in https://github.com/AAFC-BICoE/dina-planning/issues/19#issuecomment-769099870 ?

dshorthouse commented 3 years ago

I do 100%, FWIW.

Perhaps outstanding is: Are there other verbatim things we want to help faithfully describe this locality as it pertains to this collecting event that may/may not be covered by DwC. Habitat, meteorological conditions, expedition names, names of research vessels, disjunct durations (?), are all verbatim and may need thought or we kick 'em down the lane for later.

jmacklin commented 3 years ago

I do as well.

I also agree that we need either one additional generic verbatim field to capture "other" information associated with locality encompassing the examples David has above or to decide on a small subset of verbatim fields that would capture this information. A question here is whether we need structured fields in addition to the verbatim ones for some of these cases. Best example is habitat, which is a DwC term. I think that a verbatim field can be used to represent habitat but perhaps down the road we may want to separate out key components into more structured (or at least parsed) fields related to "associated species" or abiotic characteristics versus biotic ones... Also, in cases like expeditions and exsiccatae there are often sets of duplicated information on the label, like the name of the expedition or an identifier (PAN737), that can have value to search across in order to facilitate discovery (all specimens from the Lewis and Clark Expedition) or do rapid data capture (e.g., duplication of some or all of the locality information). I had this functionality in the first CMS I developed in Access, 20-something years ago, as we had lots of duplicative stuff :-)

jmacklin commented 3 years ago

Also meant to say that in zoological collections, including entomology, the locality information (at least the more detailed info) is often stored in ledgers/catalogues and NOT associated with the label (think very small labels on pins). This does not change the data capture requirements but does change potential workflows where locality/site information may be captured asynchronously from other information.

heathercole commented 3 years ago

I guess in DwC terms: locality AND verbatimLocality https://dwc.tdwg.org/terms/#location

jmacklin commented 3 years ago

Yes, in some ways but for capturing the information we may only require the verbatim locality information and then a limited set of structured fields that decompose it that bring value to searching, mapping and ultimately data sharing. Also, remember that DwC is an exchange schema so when we want to share our data with the world we would map to it (in the case of the portal and DwC-As) but within the CMS we don't have to be constrained by it and can be more or less detailed as required. Of course, logically, we would want to stay as consistent with it as we can to avoid having to transform it later...
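The idea of mapping to DwC only at export time could look something like the sketch below. The internal field names, the `to_darwin_core` helper, and the sample record are hypothetical; the targets (`verbatimLocality`, `verbatimCoordinates`, `verbatimElevation`, `habitat`) are existing Darwin Core terms.

```python
# Hypothetical internal field names mapped to Darwin Core terms.
INTERNAL_TO_DWC = {
    "verbatim_locality": "verbatimLocality",
    "verbatim_coordinates": "verbatimCoordinates",
    "verbatim_elevation": "verbatimElevation",
    "habitat": "habitat",
}


def to_darwin_core(record: dict) -> dict:
    """Return only the mapped, non-empty fields, keyed by DwC term."""
    return {
        dwc_term: record[internal_name]
        for internal_name, dwc_term in INTERNAL_TO_DWC.items()
        if record.get(internal_name) is not None
    }


# Made-up internal record; "expedition_name" has no mapping in this sketch,
# so it stays inside the CMS and is not exported.
internal_record = {
    "verbatim_locality": "5 mi N of Ottawa, along river",
    "verbatim_elevation": None,
    "habitat": "riparian forest",
    "expedition_name": "hypothetical expedition",
}
exported = to_darwin_core(internal_record)
```

Keeping the mapping in one table lets the internal model be more or less detailed than DwC while the export stays consistent.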

cgendreau commented 3 years ago

Created an issue (#107) that will focus only on the verbatim part as a first step.