Open dustymc opened 9 months ago
Are we mapping to DwC-A or GBIF's new model?
I'm interested in participating.
I am interested in participating, if y'all will have me!
Questions sent to @dbloom about publishing under the GBIF new model.
Resources https://www.gbif.org/new-data-model
- Is anyone actually doing this?
The implementation of the GBIF data model to date is experimental. We are going through distinct use cases that cover different realms of biodiversity data. Material was the first one, and we feel that we learned enough to cover that realm fairly well, but the publishing model is not ready for implementation yet. GBIF is not in a position to consume and aggregate those data wholesale yet. It is not on the calendar yet. Arctos is in a great position to be able to do it when it becomes enabled. They were at the core of the model design on two occasions.
- If so, where do I find the schema? I can find use cases and make guesses, but I feel like there must be some expected format similar to the DwC-A
Though it isn't ready to be implemented, the model as used in the Material Collections "experiment" is the closest there is to the eventual underlying model at GBIF. I expect Arctos will want to publish something close to this model, because the publishing model(s) will be simpler, and it would be unnecessary work to map to something less rich than what you see in that link.
- If Arctos decided to publish under the new model, would it be problematic for the VertNet IPT?
I expect that if Arctos uses the underlying model to map to, the VertNet IPT would be irrelevant for GBIF, as the IPT would only be able to support publishing models, not the underlying model. If Arctos ends up using a Material publishing model, it would be enabled in the VertNet IPT.
I hope that helps. I'm open to whatever questions.
Can I ask a basic-level question?
Is DWC only relevant for non-cultural collections? Do we who manage cultural items in Arctos need to be engaged in these discussions for any potential impact to Arctos field names and/or functionality?
@AJLinn this will have no impact on Arctos field names and/or functionality. We will be looking at how the fields in Arctos are mapped to Darwin Core for publishing to GBIF. If you want to publish your collections (which might be cool), then you might also be interested.
Also, response from Dave.
- No. Not publicly. There has been some testing, but the new data model is just a model.
- You already have as much information as pretty much everyone. There is no public schema because it hasn't been completed.
- Nobody can use any IPT to publish with the new data model because A) see #2 and B) the IPT hasn't been modified and released to utilize the new model. So you may be eager, but it isn't possible to do yet.
There is nothing you have failed to ask. You just happen to be anticipating something that isn't real yet - at least not at scale or in any public way. You can review the Work Programme for 2024 - https://docs.gbif.org/2024-work-programme/en/#priority4. In it you will see that it might be 2027 before the new model is fully formed and ready for widespread use (see section 4.4). In the meantime, they do have several goals to expand the model to work with ecological, eDNA and other types of data in 2024.
SO - we are mapping to DwC-A and any extensions we would like to send.
Darwin Core Archive Assistant, User Guide GBIF Registered Extensions
@tucotuco Thanks!
I'd like to review how we're sending geology.
@dustymc Is there a separate mapping for media information?
AWG Member,
The first Darwin Core Mapping Workshop was held on February 12, 2024, but we still have a way to go. The AWG would like to have a second focused workshop to continue review of the Arctos mapping to Darwin Core. If you are interested in participating, please add your availability in this When2Meet by Friday, February 16th, and remember to try for two-hour blocks.
The focused Github Issue is #7348
Thank you!
Teresa J. Mayfield-Meyer
Can we merge them all into one big-picture actionable remap doc?
I hope there's some plan to do this?!
I think it will be easier for the community to review them one by one. Once we have them all settled, we can combine.
The mapping issues that were closed, will these still be addressed?
PLEASE! Just comment here and I'll adjust the map document. (Or I can allow comments to the map doc? I'm generating SQL from it so someone changing the functional columns could have an outsized impact. I'm up for whatever, but not smart enough to address one SQL statement from 50 issues!)
I fixed some problems with locality attributes in the DWC map. @Nicole-Ridgwell-NMMNHS here's some sample data.
Lithostratigraphy looks good in the sample data.
The Map
https://docs.google.com/spreadsheets/d/1aCBYX9ErjicL8VdNdHbJUI0JTwWu6L4D_37gJ7IneRY/edit?gid=0#gid=0 will be the primary Arctos-->DWC mapping document; please make suggestions/corrections/etc in this issue.
Mapping Test
Here's a sample of DWC generated from the spreadsheet: temp_dwc_sample.csv.zip
Let me know if you need to see this with some particular data, or what I can do to make things clear.
Goals
A clear and functional DWC mapping document.
Scope
This Issue is for mapping to "flat DWC" (DwC-A). Media/AudubonCore (existing mapping) can be addressed elsewhere. Extensions (new mapping) would also need dedicated Issues and justification. (Because some - perhaps most - don't do much.)
Major Change
@mkoo and I believe mapping should be simplified so that only each "best occurrence" (e.g., what's in FLAT) is shared via DWC. That's in line with current cataloging practices, will mostly exclude things like lower-quality georeferences, will be a huge simplification in mapping and understanding the data, and will not require us to mint fake identifiers (which make GBIF nervous and might well end up in publications).
working comments
In progress: "translate" SQL (https://github.com/ArctosDB/PG_DDL/blob/master/shared_data/dwc_occurrence.sql) to spreadsheet (in a way that can be used to write dynamic SQL).
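For anyone wondering what "spreadsheet that can be used to write dynamic SQL" might look like in practice, here is a rough sketch only. It assumes the map is exported as CSV with three hypothetical columns (`dwc_term`, `source_expression`, `active`); the real spreadsheet's columns, the source column names, and the `flat` table name used here are illustrative guesses, not the actual Arctos implementation.

```python
import csv
import io

# Hypothetical CSV export of the mapping sheet. Column headers and the
# source expressions are assumptions made for illustration only.
MAPPING_CSV = """dwc_term,source_expression,active
occurrenceID,guid,yes
scientificName,scientific_name,yes
decimalLatitude,dec_lat,yes
recordedBy,collectors,no
"""


def build_select(mapping_csv: str, table: str = "flat") -> str:
    """Build one SELECT statement from the active rows of the mapping."""
    columns = []
    for row in csv.DictReader(io.StringIO(mapping_csv)):
        # Skip rows that are not switched on in the sheet.
        if row["active"].strip().lower() != "yes":
            continue
        term = row["dwc_term"].strip()
        # Guard the "functional" column: a stray edit to a term name
        # should fail loudly instead of producing odd SQL.
        if not term.isidentifier():
            raise ValueError(f"suspicious DwC term name: {term!r}")
        columns.append(f'{row["source_expression"].strip()} AS "{term}"')
    if not columns:
        raise ValueError("mapping has no active rows")
    return "SELECT\n  " + ",\n  ".join(columns) + f"\nFROM {table}"


sql = build_select(MAPPING_CSV)
print(sql)
```

The point of the check on `dwc_term` is the concern raised earlier in the thread: because SQL is generated from the sheet, an accidental change to a functional column should be caught before the statement is built.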
I'll merge related issues here so they can be addressed in context. It'll take a while.
Some possibly-related issues: https://github.com/ArctosDB/arctos/issues?q=is%3Aissue+is%3Aopen+label%3A%22Aggregator+issues%22