nextstrain / pathogen-repo-guide


ingest: Provide target for raw metadata from NCBI Datasets #38

Closed by joverlee521 7 months ago

joverlee521 commented 7 months ago

Provides an easy way for first-time users to get the full uncurated metadata returned by the NCBI Datasets commands by running the ingest workflow with the specified target dump_ncbi_dataset_report. They can then inspect and explore the raw data to decide whether to configure the workflow to use additional fields from NCBI.

The rule is added to fetch_from_ncbi.smk so that it can be run without additional configs. Note that it is not run as part of the default workflow and is only intended to be used as an explicitly specified target.
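For anyone exploring the raw output, a minimal sketch of the kind of inspection meant here: it counts how many records have a non-empty value in each column, which helps judge whether a field is worth adding to the workflow config. It assumes the dump is a tab-separated file with a header row; the function name summarize_fields and the sample columns are hypothetical and not part of this PR.

```python
import csv
import io
from collections import Counter


def summarize_fields(tsv_text):
    """Return {column_name: count_of_non_empty_values} for a TSV string."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    # Start every column at zero so fully empty columns are still reported.
    counts = Counter({name: 0 for name in (reader.fieldnames or [])})
    for row in reader:
        for name, value in row.items():
            # Skip overflow cells (key None) and missing or empty values.
            if name is not None and value not in (None, ""):
                counts[name] += 1
    return dict(counts)


# Hypothetical sample standing in for the raw report; real columns will differ.
SAMPLE = (
    "accession\thost\tcollection_date\n"
    "AB000001\tHomo sapiens\t2020-01-01\n"
    "AB000002\t\t2021-05-17\n"
)

if __name__ == "__main__":
    for name, filled in summarize_fields(SAMPLE).items():
        print(f"{name}: {filled} non-empty value(s)")
```

To use it on a real dump, read the file produced by the target into a string and pass it to summarize_fields; the output path depends on the workflow and is not assumed here.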

Prompted by @jameshadfield in review of the tutorial¹. Resolves https://github.com/nextstrain/pathogen-repo-guide/issues/30.

¹ https://github.com/nextstrain/docs.nextstrain.org/pull/195 (comment)

joverlee521 commented 7 months ago

Merging so that https://github.com/nextstrain/docs.nextstrain.org/pull/195 can be merged.