gbif / data-mobilization

For capturing and discussing potential datasets suitable for publishing to GBIF
Apache License 2.0

Ag1000G phase 3 haplotypes data release #289

Open gbif-portal opened 2 years ago

gbif-portal commented 2 years ago

Dataset link: https://www.malariagen.net/data/ag1000g-phase3-hap

Region: Africa

Taxon: Mosquitoes

Type: sampling event

Priority: medium

Bibliographic reference: The Anopheles gambiae 1000 Genomes Consortium (2021): Ag1000G phase 3 haplotypes data release. MalariaGEN. https://www.malariagen.net/data/ag1000g-phase3-hap

Comments: 'This data release includes phased haplotypes for 2,784 wild-caught mosquitoes collected from 19 countries in sub-Saharan Africa. These haplotypes can be analysed directly or used as haplotype reference panels to improve phasing of other samples. Three mosquito species are represented: Anopheles gambiae, Anopheles coluzzii and Anopheles arabiensis.'

Dataholders contact information: Jessica Way jway@broadinstitute.org

Users contact info: kingenloff@gbif.org

CecSve commented 1 year ago

Are haplotypes something we have published before, to your knowledge, @dschigel?

tobiasgf commented 1 year ago

I assume that it may just be the associated occurrences that @kingenloff is thinking about mobilizing?

kingenloff commented 1 year ago

There are three datasets like this that I thought might be worth exploring: not just the occurrences, but also the DNA information if possible. First, though, I wanted to talk with the DNA mobilization experts about which data types we are and are not trying to mobilize.

CecSve commented 1 year ago

Sure, we can probably publish occurrences based on all three proposed datasets, but we should look into what data can be shared through GBIF in the current structure.

dschigel commented 1 year ago

"Are haplotypes something we have published before, to your knowledge, @dschigel?"

Not to my knowledge, but that is no reason to ignore these. As with agrobiodiversity, meaningful health-related detections are carried out at subspecific levels: agro has cultivars, landraces, etc. Similarly, in mosquito-vectored diseases the species level often has little meaning, but populations, haplotypes, and subspecies do. We can bring the question of meaningful taxonomic levels up with the task group. This is also something for the data model, and for the levels in future backbones.