HakaiInstitute / metadata-review

Dataset - MusselSeg: Semantic Segmentation for Rocky Intertidal Mussel Habitat #104

Open hakai-it opened 1 month ago

hakai-it commented 1 month ago

MusselSeg: Semantic Segmentation Dataset for Rocky Intertidal Mussel Habitat

https://hakaiinstitute.github.io/hakai-metadata-entry-form/#/en/hakai/DCH8GzsQKSM8xwJ8WdQFIZrtCaq2/-O26fyGiVXoF__M22UrP

Best Practices Checklist

In General

Data Identification

Dataset title:

Abstract

DOI

Spatial

Contact

Resources

JessyBarrette commented 1 month ago

@willhakai Thanks for submitting a metadata record for the dataset:

MusselSeg: Semantic Segmentation Dataset for Rocky Intertidal Mussel Habitat

Couple thoughts:

timvdstap commented 1 month ago

I think @tayden is going to add you to the Hakai HuggingFace account, Jessy :) Also, I think #105 is a duplicate of this record?

tayden commented 1 month ago

The dataset is located here: https://huggingface.co/datasets/HakaiInstitute/mussel-seg-1024-1024
It's currently private, so you'll have to make an account and then I can add you so you can see it. It'll be made public once it gets the rubber stamp of approval.

willhakai commented 1 month ago

@tayden is creating a DOI on HuggingFace

Other changes made, thanks!

tayden commented 1 month ago

> @tayden is creating a DOI on HuggingFace
>
> Other changes made, thanks!

Yes, I will do this. I can't create the DOI until the dataset is public, so I'll do it then. The HF DOIs track the dataset versions automatically, so it'll be easier to do there than with an external service.

fostermh commented 1 month ago

Hmm, for the DOI, are we talking about one that points to the data resource or to the metadata record? We normally generate a DOI that points to the metadata record so that we can change where the data is stored if needed. There is a handy button in the form for just this. Or does HF require a DOI that points to their site? If so, we can do both.

tayden commented 1 month ago

HF has one that points to the dataset, on HF. The main advantage to the HF one is that you can track a specific dataset revision.

That being said, there's no reason I couldn't reference a DOI that links to the metadata record instead, but we'd lose the automatic revision tracking.
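To make "a specific dataset revision" concrete: Hub repositories are versioned, and a file can be addressed at a given branch, tag, or commit through the `resolve/<revision>/` URL pattern. The snippet below is only a rough sketch of that pattern. The helper name `dataset_file_url` and the `README.md` filename are assumptions made up for illustration, and the repo is still private, so the URL will not open without access.

```python
from urllib.parse import quote

HF_BASE = "https://huggingface.co"


def dataset_file_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Build the URL of one file in a Hub dataset repo at a given revision.

    `revision` may be a branch name, a tag, or a commit hash.
    This helper is hypothetical; it only formats a string and makes no request.
    """
    # Encode the revision fully (a revision can contain "/"), but keep
    # path separators inside the filename intact.
    return (
        f"{HF_BASE}/datasets/{repo_id}/resolve/"
        f"{quote(revision, safe='')}/{quote(filename)}"
    )


# Default branch vs. a pinned (hypothetical) tag name.
print(dataset_file_url("HakaiInstitute/mussel-seg-1024-1024", "README.md"))
print(dataset_file_url("HakaiInstitute/mussel-seg-1024-1024", "README.md", "v1.0"))
```

Pinning a tag or commit this way is what lets a publication point at the exact data it used, which is the same property the version-tracking DOI provides.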

fostermh commented 1 month ago

OK, so generating both makes sense in this case: one on HF that points to their page, and another from the form that points to our page. We indicate that the HF one is identical to the Hakai one by adding the HF DOI under 'related works'.
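As a rough illustration of the two-DOI arrangement described above, here is a minimal sketch of a record that carries its own DOI plus a related-works entry for the HF one. This is not the metadata form's actual schema: the class names, field names, and both DOI strings are placeholders invented for the example. The `IsIdenticalTo` relation and the `DOI` identifier type are borrowed from the DataCite related-identifier vocabulary.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class RelatedWork:
    identifier: str
    identifier_type: str = "DOI"            # DataCite relatedIdentifierType
    relation_type: str = "IsIdenticalTo"    # DataCite relationType


@dataclass
class MetadataRecord:
    title: str
    doi: str                                # DOI that resolves to the metadata record
    related_works: list = field(default_factory=list)

    def add_related_doi(self, doi: str, relation_type: str = "IsIdenticalTo") -> None:
        """Attach an external DOI, skipping it if it is already listed."""
        if any(work.identifier == doi for work in self.related_works):
            return
        self.related_works.append(RelatedWork(doi, relation_type=relation_type))


# Both DOIs below are placeholders, not real identifiers.
record = MetadataRecord(
    title="MusselSeg: Semantic Segmentation for Rocky Intertidal Mussel Habitat",
    doi="10.0000/hakai-record-placeholder",
)
record.add_related_doi("10.0000/hf-dataset-placeholder")
record.add_related_doi("10.0000/hf-dataset-placeholder")  # duplicate, ignored

print(asdict(record)["related_works"])
```

Keeping the record's own DOI as the primary identifier and the HF DOI as a related work means the storage location can change later without breaking the citation that points at the metadata record.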

I realize that from a technical perspective none of this is needed and we could just use the HF DOI. The issue being addressed here is one of appropriately representing ownership and credit. Wherever possible we want to link external data/metadata back to Hakai records so that we have an accurate list of Hakai data holdings.

Anyway, the short version: Taylor, you should carry on generating a DOI on HF that points to the HF page, then add it to the metadata record under related works.

Thank you.

tayden commented 1 month ago

Thanks @fostermh!

tayden commented 1 month ago
  • Does not include the word “dataset”

I've updated this and removed "dataset"

JessyBarrette commented 1 month ago

OK, I cleaned up the contacts for Hakai/Tula so there are no duplicated names.

We can certainly add a DOI to this specific repo. You can then still generate a DOI on HuggingFace and reference your version-specific DOI in any publications. We can track related DOIs that are external to the organization ourselves.