microbiomedata / nmdc-runtime

Runtime system for NMDC data management and orchestration
https://microbiomedata.github.io/nmdc-runtime/

Add INSDC identifiers to NMDC studies on `insdc_bioproject_identifiers` slot #557

Open sujaypatil96 opened 3 months ago

sujaypatil96 commented 3 months ago

The NMDC holds data for a number of studies. For a list of all the studies, see here: https://api.microbiomedata.org/docs#/find/find_studies_studies_get

One of the requirements of the NCBI Export squad is that the `insdc_bioproject_identifiers` slot be populated on Study class objects. The presence or absence of a value on this slot tells us whether we need to create a new BioProject in NCBI.

This issue requests that we populate the `insdc_bioproject_identifiers` slot for all studies in NMDC with the shared INSDC identifiers, where they are present.
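
To illustrate the presence/absence check described above, here is a minimal sketch in Python. It assumes each study is available as a plain dict with an `id` key and an optional `insdc_bioproject_identifiers` list; the function name `split_by_bioproject` and the sample IDs are made up for illustration and are not taken from NMDC data or the runtime codebase.

```python
SLOT = "insdc_bioproject_identifiers"


def split_by_bioproject(studies):
    """Partition study dicts by whether SLOT holds at least one value.

    Returns (has_ids, missing): two lists of study ids. A study counts as
    missing when the slot is absent, None, or an empty list.
    """
    has_ids, missing = [], []
    for study in studies:
        if study.get(SLOT):
            has_ids.append(study["id"])
        else:
            missing.append(study["id"])
    return has_ids, missing


# Hypothetical study documents; the ids and the BioProject value are placeholders.
sample_studies = [
    {"id": "nmdc:sty-00-000001", SLOT: ["bioproject:PRJNA000000"]},
    {"id": "nmdc:sty-00-000002", SLOT: []},
    {"id": "nmdc:sty-00-000003"},
]

has_ids, missing = split_by_bioproject(sample_studies)
print("already linked to a BioProject:", has_ids)
print("would need a new BioProject:", missing)
```

Studies in the second list are the ones for which, under the rule stated above, a new BioProject would have to be created in NCBI unless an existing INSDC identifier can be found and added to the slot first.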