ncbi / datasets

NCBI Datasets is a new resource that lets you easily gather data from across NCBI databases.
https://www.ncbi.nlm.nih.gov/datasets

`ncbi_dataset/data/data_report.jsonl` sometimes missing #328

Closed bernt-matthias closed 8 months ago

bernt-matthias commented 8 months ago

When executing:

datasets download gene accession 'WP_004675351.1' --include-flanks-bp 100 --include gene,protein --no-progressbar
dataformat tsv prok-gene --package ncbi_dataset.zip --fields accession,description,ec-number,gene-symbol,mapping-count,protein-length,protein-name > gene_data_report.tsv

The second command fails with: `Error: no matching files found for [ncbi_dataset/data/data_report.jsonl]`

ericcox1 commented 8 months ago

Hi @bernt-matthias,

Thanks for opening this issue. This is a known bug: we don't return the data report for WP proteins that have been suppressed. WP_004675351.1 was suppressed for the following reason: "This protein record was suppressed because it is no longer annotated on any genome."

I'll bring this up with the team and I'll comment on this thread with any updates.

Best, Eric

ericcox1 commented 8 months ago

Hi @bernt-matthias,

We won't be able to tackle this bug in the near term. I'm closing this ticket for now, but I will reopen it if we are able to revisit the bug.

Thanks again for your report.

Best, Eric
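
Since the bug is not going to be fixed in the near term, a script that calls `dataformat` may want to check the downloaded package first. The following is a minimal sketch of such a guard, not an official workaround. It assumes the package is a regular zip archive and that the report, when present, sits at exactly the path named in the error message. The helper names `has_data_report` and `write_gene_report` are hypothetical; the `dataformat` arguments are the ones from the original report.

```python
import subprocess
import zipfile

# Path taken from the error message in this issue.
DATA_REPORT = "ncbi_dataset/data/data_report.jsonl"


def has_data_report(package_path, member=DATA_REPORT):
    """Return True if the zip package contains the data report file."""
    with zipfile.ZipFile(package_path) as package:
        return member in package.namelist()


def write_gene_report(package_path, out_path, fields):
    """Run dataformat only when the data report exists (hypothetical helper).

    Returns True if dataformat was run, False if the report was missing,
    which is what happens for suppressed WP accessions.
    """
    if not has_data_report(package_path):
        return False
    # Same invocation as in the report above, with stdout sent to out_path.
    with open(out_path, "w") as out:
        subprocess.run(
            ["dataformat", "tsv", "prok-gene",
             "--package", package_path,
             "--fields", ",".join(fields)],
            stdout=out,
            check=True,
        )
    return True
```

A caller could then treat a `False` return from `write_gene_report("ncbi_dataset.zip", "gene_data_report.tsv", ["accession", "description"])` as "accession likely suppressed" and skip or log it instead of failing on the `dataformat` error.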