Identify what terms need to be included in this ontology by querying the NMDC database

I consider MongoDB to be the system of record for NMDC Biosample metadata at this time. An API call like this will retrieve the data relevant to this issue:

https://api.microbiomedata.org/nmdcschema/biosample_set?max_page_size=20&projection=env_broad_scale%2Cenv_local_scale%2Cenv_medium

How many Biosamples do we have?

https://api.microbiomedata.org/nmdcschema/collection_stats

About 8000, so max_page_size will have to be increased, or we will have to iterate over pages.
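If iterating over pages turns out to be necessary, it could look roughly like the sketch below. The `resources` response key and the `page_token` query parameter are assumptions on my part and should be checked against the API docs; `fetch_page` and `iter_biosamples` are just illustrative names, not part of any NMDC client.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.microbiomedata.org/nmdcschema/biosample_set"
PROJECTION = "env_broad_scale,env_local_scale,env_medium"


def fetch_page(params):
    """Fetch one page of results and return the decoded JSON body."""
    url = BASE_URL + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def iter_biosamples(max_page_size=1000, fetch=fetch_page):
    """Yield Biosample records, following next_page_token until it is absent."""
    params = {"max_page_size": max_page_size, "projection": PROJECTION}
    while True:
        page = fetch(params)
        # Assumption: the records are listed under a "resources" key.
        yield from page.get("resources", [])
        token = page.get("next_page_token")
        if not token:
            break
        # Assumption: the token is passed back as the "page_token" parameter.
        params = {**params, "page_token": token}
```

Calling `list(iter_biosamples())` would then collect every Biosample regardless of how many pages the server splits them into. The `fetch` argument is only there so the loop can be exercised without hitting the live API.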

The absence of a next_page_token value in the response to the request below indicates that all of the Biosamples can be retrieved in a single request, at least at this point.

https://api.microbiomedata.org/nmdcschema/biosample_set?max_page_size=9999&projection=env_broad_scale%2Cenv_local_scale%2Cenv_medium
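Once the Biosamples are in hand, the terms in use could be tallied with something like the sketch below. It assumes each env_* value is an object with an optional `term.id` and a `has_raw_value` string; that shape is my assumption and should be confirmed against an actual response. `tally_env_terms` is a hypothetical helper name.

```python
from collections import Counter

ENV_SLOTS = ("env_broad_scale", "env_local_scale", "env_medium")


def tally_env_terms(biosamples):
    """Count how often each term appears in each env_* slot."""
    counts = {slot: Counter() for slot in ENV_SLOTS}
    for sample in biosamples:
        for slot in ENV_SLOTS:
            # A slot may be missing or null on some Biosamples.
            value = sample.get(slot) or {}
            term_id = (value.get("term") or {}).get("id")
            # Fall back to the raw string when there is no structured term id.
            key = term_id or value.get("has_raw_value")
            if key:
                counts[slot][key] += 1
    return counts
```

The keys of each Counter would be the candidate terms for the ontology, and the counts give a rough sense of which ones matter most.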

