Closed MarineLebrec closed 5 months ago
I made some great progress on this dataset in our recent Data Mobilization Workshop - thanks to the organizers! The DwC-A that I generated is ready for review. Once everything looks good, @sformel-usgs will train me up on using the OBIS IPT so I can submit this myself.
One piece I could use feedback on is: are the NERC vocabularies that I used in my EMOF table adequate for organism density and sample area? I found vocabularies from the NERC P06 and P09 collections, but in one of the workshop material pages (under the EMOF section), it says "When using NERC vocabulary terms, you must choose a term from the P01 collection."
Thanks for any feedback! 😄
@MarineLebrec I will try to look at this in the next week, but anyone else is welcome to give it a review too.
@MarineLebrec generally the dataset looks good! My video for the IPT training isn't ready yet, so let's plan some time for us to publish it together. I'll send you an email about it. There are a couple of small things worth revising:
scientificName has values that include spp after the genus (e.g. Echinaster spp). Although I believe this can be correctly interpreted by OBIS and GBIF, I think it is less noisy to put this in verbatimIdentification and for scientificName to only include the genus (e.g. Echinaster). Here is an R snippet that will accomplish this:
occ_data_frame |>
  rename(verbatimIdentification = scientificName) |>
  mutate(scientificName = stringr::str_remove(verbatimIdentification, pattern = " spp"))
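A slightly more defensive variant of the same idea, sketched in base R so it runs without dplyr or stringr. It assumes the qualifier may show up as sp, spp, sp. or spp. at the end of the name; the toy table and the regular expression are illustrative assumptions, not taken from the dataset.

```r
# Toy occurrence table; the values are made up for illustration.
occ <- data.frame(
  scientificName = c("Echinaster spp", "Pisaster ochraceus", "Haliotis spp."),
  stringsAsFactors = FALSE
)

# Keep the name exactly as it was recorded.
occ$verbatimIdentification <- occ$scientificName

# Drop a trailing "sp", "spp", "sp." or "spp." together with the whitespace before it.
occ$scientificName <- sub("[[:space:]]+spp?\\.?$", "", occ$verbatimIdentification)

print(occ)
```

Anchoring the pattern to the end of the string leaves full binomials untouched and only trims the qualifier.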
You have multiple kingdom columns in the occurrence file. They are the same information, but it will save confusion if you remove one.
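One way to drop the repeated column, as a rough base R sketch. It assumes the file was read with read.csv(), which renames a repeated header to something like kingdom.1; the column names and values below are hypothetical. A repeat is only removed if its values are identical to the first kingdom column.

```r
# Toy occurrence table with a repeated kingdom column (hypothetical names and values).
occ <- data.frame(
  occurrenceID = c("a1", "a2"),
  kingdom = c("Animalia", "Animalia"),
  kingdom.1 = c("Animalia", "Animalia"),
  stringsAsFactors = FALSE
)

# All columns named kingdom, kingdom.1, kingdom.2, ...
kingdom_cols <- grep("^kingdom(\\.[0-9]+)?$", names(occ), value = TRUE)

# Later kingdom columns whose values exactly match the first one.
extra <- kingdom_cols[-1]
redundant <- extra[vapply(
  extra,
  function(col) identical(occ[[col]], occ[[kingdom_cols[1]]]),
  logical(1)
)]

# Keep everything except the redundant copies.
occ <- occ[, setdiff(names(occ), redundant), drop = FALSE]

print(names(occ))
```

If a repeat does not match the first column, it is left in place so the difference can be checked by hand.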
For the vocab codes, I know it's confusing. I got lost in it myself and then I realized the terms you had chosen for measurementTypeID had P01 synonyms. So:
http://vocab.nerc.ac.uk/collection/P09/current/ABB2/ becomes http://vocab.nerc.ac.uk/collection/P01/current/SDBIOL02/
http://vocab.nerc.ac.uk/collection/P09/current/ABED/ becomes http://vocab.nerc.ac.uk/collection/P01/current/AREABEDS/
One of the things I learned is that if you find a term that's in the 'wrong' vocabulary for your needs, look at the bottom for any instances described as Same As.
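The two swaps above could be applied to the EMOF table with a small lookup, sketched here in base R. The lookup contains only the two P09 to P01 pairs quoted in this thread; the emof data frame and its measurementID values are hypothetical stand-ins for the real table.

```r
# P09 -> P01 equivalents quoted in this thread.
p01_lookup <- c(
  "http://vocab.nerc.ac.uk/collection/P09/current/ABB2/" =
    "http://vocab.nerc.ac.uk/collection/P01/current/SDBIOL02/",
  "http://vocab.nerc.ac.uk/collection/P09/current/ABED/" =
    "http://vocab.nerc.ac.uk/collection/P01/current/AREABEDS/"
)

# Toy EMOF rows (hypothetical IDs) that still carry the P09 URIs.
emof <- data.frame(
  measurementID = c("m1", "m2"),
  measurementTypeID = names(p01_lookup),
  stringsAsFactors = FALSE
)

# Replace only the URIs that have a known P01 equivalent.
hit <- emof$measurementTypeID %in% names(p01_lookup)
emof$measurementTypeID[hit] <- unname(p01_lookup[emof$measurementTypeID[hit]])

# Flag anything still outside P01 for manual review.
not_p01 <- !grepl("/collection/P01/", emof$measurementTypeID, fixed = TRUE)

print(emof)
```

Any rows flagged in not_p01 would still need a P01 term looked up by hand.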
I so appreciate you taking a look at this @sformel-usgs !! I'll devote some time to addressing your feedback next week and will reach back out with questions. Once we are both happy with it, I'll book a spot in your calendar for an IPT overview.
Update: I've been able to make the suggested corrections to my DwC-A, which are reflected in the same GitHub repo: https://github.com/CeNCOOS/MPA_data_integration/tree/main/MARINe/CBS_swathsurveys
We'll be meeting on June 17th to go through the IPT, EML metadata, and any other last steps! Thanks for all your help.
Thanks to much appreciated help from @sformel-usgs we got this dataset published so I'll close this issue :) GBIF: https://www.gbif.org/dataset/fdcdc447-2032-4edf-9519-0ec89ae1b9c5 OBIS: https://obis.org/dataset/30884b6c-e8e1-453c-af20-7ed8318489c6
Contact details
mlebrec@mbari.org
Dataset Title
MARINe/PISCO: Intertidal: MARINe Coastal Biodiversity Surveys: Swath Surveys Summarized
Describe your dataset and any specific challenges or blockers you have or anticipate.
This dataset includes ~20 years of intertidal swath transects to estimate the density of seastars and abalone from a number of monitoring sites along the US West Coast, managed by the Multi-Agency Rocky Intertidal Network (MARINe). I worked on mobilizing a similar dataset last year (intertidal point contact data). This workshop will allow me to devote time to mobilizing another large dataset to OBIS/GBIF and answer any questions I have along the way.
Info about "raw" Data Files.
The data is published to DataONE here: https://data.piscoweb.org/metacatui/view/doi%3A10.6085%2FAA%2Fmarine_cbs.11.6