gbif / portal-feedback

User feedback for the GBIF API, website and published data. You can ask questions here. 🗨❓

Crediting authors in INSDC dataset #4982

Open ymgan opened 1 year ago

ymgan commented 1 year ago

Hi,

I am referring to the dataset and records below:

GBIF dataset: INSDC Sequences
GenBank record: https://www.ncbi.nlm.nih.gov/nuccore/MZ823800
GBIF occurrence record: https://www.gbif.org/occurrence/3969152007

collected_by is a single-value field in a GenBank submission. It is nice that the collector is acknowledged. However, the author who performed the sequencing and identification (Altenburger,A. in this example) is not listed anywhere in the occurrence record. Is there any plan to improve the acknowledgement of the authors of GenBank submissions for the INSDC dataset in GBIF, please?

Thanks a lot!

@aaltenburger2 @rukayaj

thomasstjerne commented 1 year ago

We retrieve data through the EBI portal API like this:
https://www.ebi.ac.uk/ena/portal/api/search?result=sequence&format=json&limit=100&query=(country=%22Denmark*%22%20AND%20tax_id=704171)&fields=doi,country,sample_accession,accession,location,country,identified_by,collected_by,collection_date,specimen_voucher,sequence_md5,scientific_name,tax_id,altitude,sex,description

Unfortunately, there is not much information about the publication there. The full list of available attributes is here: https://www.ebi.ac.uk/ena/portal/api/returnFields?dataPortal=ena&format=json&result=sequence
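
For anyone who wants to reproduce the query above, here is a minimal Python sketch that assembles the same search URL from its parts. The function names `build_search_url` and `fetch_records` are hypothetical and not taken from the GBIF codebase. The field list is copied from the URL above with the duplicate `country` dropped, and the sketch assumes the endpoint returns a JSON document when `format=json` is set. Characters such as parentheses are percent-encoded more strictly than in the hand-written URL, which should decode to the same query.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Endpoint taken from the URL quoted in the comment above.
ENA_SEARCH = "https://www.ebi.ac.uk/ena/portal/api/search"

# Fields from the quoted URL (duplicate "country" removed).
FIELDS = [
    "doi", "country", "sample_accession", "accession", "location",
    "identified_by", "collected_by", "collection_date", "specimen_voucher",
    "sequence_md5", "scientific_name", "tax_id", "altitude", "sex",
    "description",
]


def build_search_url(country, tax_id, limit=100, fields=FIELDS):
    """Build an ENA portal search URL (hypothetical helper)."""
    # Same filter expression as in the quoted URL, before percent-encoding.
    query = f'(country="{country}*" AND tax_id={tax_id})'
    params = {
        "result": "sequence",
        "format": "json",
        "limit": limit,
        "query": query,
        "fields": ",".join(fields),
    }
    # quote_via=quote encodes spaces as %20 rather than '+'.
    return f"{ENA_SEARCH}?{urlencode(params, quote_via=quote)}"


def fetch_records(url):
    """Fetch and parse the response (assumes a JSON body; needs network)."""
    with urlopen(url) as response:
        return json.load(response)


url = build_search_url("Denmark", 704171)
```

Calling `fetch_records(url)` would then retrieve the records. Note that none of the requested fields carries the submitting authors, which is the gap raised in this issue.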