gbif / portal-feedback

User feedback for the GBIF API, website and published data. You can ask questions here. 🗨❓

Hello, #5033

Closed gbif-portal closed 11 months ago

gbif-portal commented 11 months ago

Hello,


System: Chrome 118.0.0 / Windows 10.0.0
Referer: https://www.gbif.org/occurrence/1949672798
Window size: width 1912 - height 966
System health at time of feedback: OPERATIONAL
datasetKey: 3b8c5ed8-b6c2-4264-ac52-a9d772d69e9f
publishingOrgKey: 299958e0-4c06-11d8-b290-b8a03c50a862

Node handles: @DanBIF

MenthaA commented 11 months ago

Hello, sorry for the empty message, which I sent by mistake. I work at the Conservatory and Botanical Garden of Geneva, and we are working on mapping our DNA data to DwCA using GGBN extensions. A blog post (https://data-blog.gbif.org/post/gbif-molecular-data/) mentioned that this dataset uses such extensions, whose fields are not indexed in the portal and thus should not be visible on the occurrence page. It also mentioned that these fields should be visible when downloading the DwCA for the dataset. I downloaded the DwCA but could not find those fields, nor could I identify a field that supplies the DNA-derived data shown on the occurrence page. Where could I find a "DwC(A) upload file" that was used to import DNA-derived data into GBIF, to serve as an example for formatting our datasets?

Thank you in advance,

Anouk Mentha

ManonGros commented 11 months ago

Thanks @MenthaA for the question. It looks like my blog post may be outdated now. There is indeed no GGBN extension in that dataset (only a sequence extension). Here is a search query for all the datasets that contain a DNA-derived data extension:

https://www.gbif.org/occurrence/datasets?advanced=1&occurrence_status=present&dwca_extension=http:~2F~2Frs.gbif.org~2Fterms~2F1.0~2FDNADerivedData

Note that the DwC-A extension field in the advanced filter lets you select occurrences that are associated with various extensions. See the screenshot below:

Screenshot 2023-11-08 at 08 59 01
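The same filter can probably be applied through the occurrence search API rather than the website. The sketch below is a minimal illustration, assuming that the API parameter is named `dwcaExtension` and that a facet on `datasetKey` comes back under the field name `DATASET_KEY`; the helper names `facet_url`, `dataset_counts` and `fetch_dataset_counts` are made up for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint and parameter names; check them against the GBIF API docs.
API = "https://api.gbif.org/v1/occurrence/search"
DNA_DERIVED = "http://rs.gbif.org/terms/1.0/DNADerivedData"


def facet_url(extension=DNA_DERIVED, facet_limit=50):
    """Build a search URL that returns no records, only dataset facet counts."""
    params = {
        "dwcaExtension": extension,      # assumed API name of the website filter
        "occurrenceStatus": "PRESENT",
        "facet": "datasetKey",
        "facetLimit": facet_limit,
        "limit": 0,                      # facets only, no occurrence records
    }
    return f"{API}?{urlencode(params)}"


def dataset_counts(response):
    """Map dataset key -> occurrence count from a faceted search response."""
    counts = {}
    for facet in response.get("facets", []):
        if facet.get("field") == "DATASET_KEY":
            for entry in facet.get("counts", []):
                counts[entry["name"]] = entry["count"]
    return counts


def fetch_dataset_counts(extension=DNA_DERIVED):
    """Network call; not executed on import."""
    with urlopen(facet_url(extension)) as resp:
        return dataset_counts(json.load(resp))
```

Calling `fetch_dataset_counts()` should then give one entry per dataset that has occurrences carrying the extension, up to the facet limit.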

You can download the source archives from the dataset pages.
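The source archive link shown on a dataset page should also be available from the registry API. As a rough sketch, assuming the dataset record lists its endpoints with a `type` of `DWC_ARCHIVE` and a `url` (the helper names `archive_urls` and `fetch_dataset` are hypothetical):

```python
import json
from urllib.request import urlopen


def archive_urls(dataset):
    """Return the source Darwin Core Archive URLs listed in a dataset record."""
    return [
        endpoint["url"]
        for endpoint in dataset.get("endpoints", [])
        if endpoint.get("type") == "DWC_ARCHIVE" and endpoint.get("url")
    ]


def fetch_dataset(dataset_key):
    """Network call; fetches the registry record for one dataset."""
    with urlopen(f"https://api.gbif.org/v1/dataset/{dataset_key}") as resp:
        return json.load(resp)
```

For the dataset in this issue, `archive_urls(fetch_dataset("3b8c5ed8-b6c2-4264-ac52-a9d772d69e9f"))` would be the call to try; the result depends on what the publisher has registered.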

I think there should be some examples as well in the DNA-derived data publishing guide:

Abarenkov K, Andersson AF, Bissett A, Finstad AG, Fossøy F, Grosjean M, Hope M, Jeppesen TS, Kõljalg U, Lundin D, Nilsson RN, Prager M, Provoost P, Schigel D, Suominen S, Svenningsen C & Frøslev TG (2023) Publishing DNA-derived data through biodiversity data platforms, v1.3. Copenhagen: GBIF Secretariat. https://doi.org/10.35035/doc-vf1a-nr22

Let me know if this answers your question.

MenthaA commented 11 months ago

Thank you @ManonGros, this answers my question: I wasn't aware that we could download the source archive from the dataset page (until today I always downloaded from the occurrence list or a selection of occurrences). I was wondering why I could not see the extensions when downloading a selection of occurrences, but I guess this is due to the GBIF download capabilities, which are restricted to the default core (occurrence or event) and extensions (multimedia, verbatim, rights and citation)?

Anyway, now I can have a look at the source archives, it is great. Thank you!
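One way to check which extensions a source archive actually carries is to read its `meta.xml` descriptor. This is only a sketch: it assumes the archive is a zip file with `meta.xml` at the top level, using the standard Darwin Core text namespace, and `list_row_types` is a name invented for this example.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by Darwin Core Archive descriptors (meta.xml).
NS = {"dwc": "http://rs.tdwg.org/dwc/text/"}


def list_row_types(archive_bytes):
    """Return (core rowType, [extension rowTypes]) declared in meta.xml."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        root = ET.fromstring(zf.read("meta.xml"))
    core = root.find("dwc:core", NS)
    core_type = core.get("rowType") if core is not None else None
    extensions = [ext.get("rowType") for ext in root.findall("dwc:extension", NS)]
    return core_type, extensions
```

An archive that includes DNA-derived data should list `http://rs.gbif.org/terms/1.0/DNADerivedData` among the extension row types.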

ManonGros commented 11 months ago

Yes, our system doesn't offer extension download through the occurrence search and download API (and interface). We are exploring how this could be made possible though. If you are interested, you can keep an eye on this issue: https://github.com/gbif/occurrence/issues/222