ioos / bio_data_guide

Standardizing Marine Biological Data Working Group - An open community to facilitate the mobilization of biological data to OBIS.
https://ioos.github.io/bio_data_guide/
MIT License

MARINe/PISCO: intertidal swath density surveys #244

Closed MarineLebrec closed 5 months ago

MarineLebrec commented 9 months ago

Contact details

mlebrec@mbari.org

Dataset Title

MARINe/PISCO: Intertidal: MARINe Coastal Biodiversity Surveys: Swath Surveys Summarized

Describe your dataset and any specific challenges or blockers you have or anticipate.

This dataset includes ~20 years of intertidal swath transects to estimate the density of seastars and abalone from a number of monitoring sites along the US West Coast, managed by the Multi-Agency Rocky Intertidal Network (MARINe). I worked on mobilizing a similar dataset last year (intertidal point contact data). This workshop will allow me to devote time to mobilizing another large dataset to OBIS/GBIF and answer any questions I have along the way.

Info about "raw" Data Files.

The data is published to DataONE here: https://data.piscoweb.org/metacatui/view/doi%3A10.6085%2FAA%2Fmarine_cbs.11.6

MarineLebrec commented 7 months ago

I made some great progress on this dataset in our recent Data Mobilization Workshop - thanks to the organizers! The DwC-A that I generated is ready for review. Once everything looks good, @sformel-usgs will train me up on using the OBIS IPT so I can submit this myself.

One piece I could use feedback on is: are the NERC vocabularies that I used in my EMOF table adequate for organism density and sample area? I found vocabularies from the NERC P06 and P09 collections, but in one of the workshop material pages (under the EMOF section), it says "When using NERC vocabulary terms, you must choose a term from the P01 collection."

Thanks for any feedback! 😄

sformel-usgs commented 6 months ago

@MarineLebrec I will try to look at this in the next week, but anyone else is welcome to give it a review too.

sformel-usgs commented 6 months ago

@MarineLebrec generally the dataset looks good! My video for the IPT training isn't ready yet, so let's plan some time for us to publish it together. I'll send you an email about it. There are a couple of small things worth revising:

Feedback:

  1. scientificName has values that include spp after the genus (e.g. Echinaster spp). Although I believe this can be correctly interpreted by OBIS and GBIF, I think it is less noisy to put this in verbatimIdentification and for scientificName to only include the genus (e.g. Echinaster). Here is an R snippet that will accomplish this:
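library(dplyr)
# Keep the original value in verbatimIdentification, then strip a trailing " spp" (or " spp.")
occ_data_frame <- occ_data_frame |>
  rename(verbatimIdentification = scientificName) |>
  mutate(scientificName = stringr::str_remove(verbatimIdentification, pattern = " spp\\.?$"))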
  2. You have multiple kingdom columns in the occurrence file. They are the same information, but it will save confusion if you remove one.

  3. For the vocab codes, I know it's confusing. I got lost in it myself and then I realized the terms you had chosen for measurementTypeID had P01 synonyms. So:

One of the things I learned is that if you find a term that's in the 'wrong' vocabulary for your needs, look at the bottom for any instances described as "Same As".
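For anyone following along, the first two suggestions above can be sketched together as below. This is only an illustration: the `occ` table, its rows, and the `kingdom.1` column name (what `read.csv` would typically call a repeated header) are made up for the example and are not taken from the actual DwC-A.

```r
library(dplyr)

# Hypothetical occurrence table; the rows and column names are assumptions
# for illustration, not the real MARINe swath data.
occ <- data.frame(
  occurrenceID   = c("swath_1", "swath_2", "swath_3"),
  scientificName = c("Echinaster spp", "Pisaster ochraceus", "Haliotis spp"),
  kingdom        = c("Animalia", "Animalia", "Animalia"),
  kingdom.1      = c("Animalia", "Animalia", "Animalia")
)

# Drop any column whose contents exactly repeat an earlier column.
# Caution: this also drops unrelated columns that happen to hold identical values.
occ <- occ[, !duplicated(as.list(occ)), drop = FALSE]

# Preserve the original name, then strip a trailing " spp" or " spp."
occ <- occ |>
  rename(verbatimIdentification = scientificName) |>
  mutate(scientificName = stringr::str_remove(verbatimIdentification, " spp\\.?$"))

print(occ)
```

If the duplicated column is known by name, dropping it explicitly with `select(-kingdom.1)` is the safer choice, since the content-based check cannot tell an accidental duplicate from two fields that legitimately share values.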

MarineLebrec commented 6 months ago

I so appreciate you taking a look at this @sformel-usgs !! I'll devote some time to addressing your feedback next week and will reach back out with questions. Once we are both happy with it, I'll book a spot in your calendar for an IPT overview.

MarineLebrec commented 6 months ago

Update: I've made the suggested corrections to my DwC-A, which are reflected in the same GitHub repo: https://github.com/CeNCOOS/MPA_data_integration/tree/main/MARINe/CBS_swathsurveys

We'll be meeting on June 17th to go through the IPT, EML metadata, and any other last steps! Thanks for all your help.

MarineLebrec commented 5 months ago

Thanks to much-appreciated help from @sformel-usgs, we got this dataset published, so I'll close this issue :)
GBIF: https://www.gbif.org/dataset/fdcdc447-2032-4edf-9519-0ec89ae1b9c5
OBIS: https://obis.org/dataset/30884b6c-e8e1-453c-af20-7ed8318489c6