ncbi / datasets

NCBI Datasets is a new resource that lets you easily gather data from across NCBI databases.
https://www.ncbi.nlm.nih.gov/datasets

Error when creating `dehydrated` package for download #401

Status: Closed (SchwarzMarek closed this issue 2 months ago)

SchwarzMarek commented 2 months ago


Describe the bug

Cannot prepare a dehydrated package. The download fails with this error: Error: Download error: stream error: stream ID 21; INTERNAL_ERROR; received from peer.

I'm at a loss as to where the problem lies. I have tried different connections, different PCs, and different ncbi-datasets-cli versions. Is there anything I can do on my end? (I have used datasets this way successfully in the past.)

To Reproduce

datasets download genome taxon 2 --assembly-level complete --exclude-atypical --assembly-source RefSeq --dehydrated --debug

##### trimmed output #####

Collecting 43,211 genome records [================================================] 100% 43211/43211
2024/09/16 09:44:08 
HTTP/2.0 200 OK
Content-Disposition: attachment; filename=ncbi_dataset.zip
Content-Security-Policy: upgrade-insecure-requests
Content-Type: application/zip
Date: Mon, 16 Sep 2024 07:44:08 GMT
Grpc-Metadata-Logging-Accessions: GCF_000007945.1,GCF_000008505.1,GCF_000022305.1,GCF_000093085.1,GCF_000144625.1,GCF_000146185.1,GCF_000164865.1,GCF_000196515.1,GCF_000987865.1,GCF_001461805.1,GCF_001647655.1,GCF_001647695.1,GCF_001647735.1,GCF_001647765.1,GCF_001647795.1,GCF_001647815.1,GCF_001647835.1,GCF_001647855.1,GCF_001647875.1,GCF_001647895.1,GCF_001647915.1,GCF_001683055.1,GCF_002057455.1,GCF_002208665.2,GCF_003595605.1,GCF_004214875.1,GCF_004295125.1,GCF_008281175.1,GCF_009684695.1,GCF_016889005.1,GCF_019597925.1,GCF_020892115.1,GCF_022845755.1,GCF_022845815.1,GCF_022845835.1,GCF_025149125.1,GCF_027941655.1,GCF_033539415.1,GCF_033539435.1,GCF_033539455.1,GCF_033539475.1,GCF_033539495.1,GCF_033539515.1,GCF_033539535.1,GCF_035940175.1,GCF_036240075.1,GCF_036419135.1,GCF_040267475.1,GCF_900637315.1,GCF_940677205.1
Grpc-Metadata-Logging-Accessions_count: 43211
Grpc-Metadata-Logging-Activity: download
Grpc-Metadata-Logging-Hydrated: 1
Grpc-Metadata-Logging-Include_annotation_type: GENOME_FASTA
Grpc-Metadata-Logging-Include_tsv: False
Grpc-Metadata-Logging-Service: genome
Grpc-Metadata-Via: h2 linkerd
Ncbi-Phid: EAA99DBE6D077E07773EE4AF.49.1
Server: Apache
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
Vary: Accept-Encoding
X-Datasets-Version: 16.28.0
X-Ua-Compatible: IE=Edge
X-Xss-Protection: 1; mode=block

Collecting 43,211 genome records [================================================] 100% 43211/43211
Downloading: ncbi_dataset.zip    3MB 40.8kB/s
Error: Download error: stream error: stream ID 21; INTERNAL_ERROR; received from peer

Use datasets download genome taxon <command> --help for detailed help about a command.

Expected behavior

Obtain a dehydrated package that can then be rehydrated.

Thank you
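For readers unfamiliar with the workflow this report expects to work, here is a minimal Python sketch of it: download a dehydrated package, unpack it, then rehydrate it. The download flags are copied from the command above. The `--filename` flag appears in a later comment in this thread, and `datasets rehydrate --directory` is taken from the NCBI Datasets documentation rather than from this thread. The helper names (`build_download_cmd`, `build_rehydrate_cmd`, `fetch_dehydrated`) are hypothetical, and this is an illustration under those assumptions, not a fix for the error.

```python
import subprocess
import zipfile
from pathlib import Path


def build_download_cmd(taxon, zip_path):
    """Build the argv for a dehydrated genome download (flags as in the report)."""
    return [
        "datasets", "download", "genome", "taxon", str(taxon),
        "--assembly-level", "complete",
        "--exclude-atypical",
        "--assembly-source", "RefSeq",
        "--dehydrated",
        "--filename", str(zip_path),
    ]


def build_rehydrate_cmd(directory):
    """Build the argv that fetches the sequence files for an unpacked package."""
    return ["datasets", "rehydrate", "--directory", str(directory)]


def fetch_dehydrated(taxon, workdir):
    """Download, unpack, and rehydrate a package. Hypothetical helper; needs the CLI on PATH."""
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)
    zip_path = workdir / "ncbi_dataset.zip"

    # Step 1: fetch the small dehydrated zip (metadata plus a list of files to fetch later).
    subprocess.run(build_download_cmd(taxon, zip_path), check=True)

    # Step 2: unpack it next to the zip.
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(workdir)

    # Step 3: rehydrate, which downloads the actual sequence data.
    subprocess.run(build_rehydrate_cmd(workdir), check=True)
    return workdir
```

In this report, step 1 is the one that fails, so the later steps are never reached.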


liangje commented 2 months ago

Similar issues on my end:

../datasets download genome taxon vibrio --assembly-version latest --assembly-source GenBank --dehydrated --exclude-multi-isolate --filename vibrio-ncbi.zip --mag exclude

New version of client (16.28.0) available at https://ftp.ncbi.nlm.nih.gov/pub/datasets/command-line/LATEST/linux-amd64/datasets.
Collecting 8,645 genome records [================================================] 100% 8645/8645
Downloading: vibrio-ncbi.zip 725kB 78.7kB/s
Error: Download error: stream error: stream ID 27; INTERNAL_ERROR; received from peer

Use datasets download genome taxon --help for detailed help about a command.

olearyna commented 2 months ago

Hi liangje and SchwarzMarek

Thank you for reporting this issue. We have been able to reproduce the problem and are currently investigating it. We'll keep you updated on our progress and will work towards implementing a fix as soon as possible.

Thanks for your patience!

Nuala

ericcox1 commented 2 months ago

Hi @liangje and @SchwarzMarek,

This bug has been fixed. Please update to the latest version, 16.29.0.

Thanks again for your report.

Best, Eric

Eric Cox, PhD [Contractor] (he/him/his), NCBI Datasets, NIH/NLM/NCBI, eric.cox@nih.gov
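To confirm that an installed client is at least the version named in the fix (16.29.0), a small Python sketch like the following may help. It assumes that `datasets --version` prints a line containing a dotted three-part version number. The function names are hypothetical, and the parsing works on any string containing such a number, for example the `X-Datasets-Version` header in the debug output.

```python
import re
import subprocess

# Version that the maintainers say contains the fix.
FIXED_VERSION = (16, 29, 0)


def parse_version(text):
    """Extract the first MAJOR.MINOR.PATCH number from text as a tuple of ints."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def has_fix(version_text):
    """Return True if the version in version_text is 16.29.0 or newer."""
    return parse_version(version_text) >= FIXED_VERSION


def installed_version_text():
    """Ask the local CLI for its version (assumes `datasets` is on PATH)."""
    result = subprocess.run(
        ["datasets", "--version"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

A typical check would be `has_fix(installed_version_text())`. If it returns False, updating the client as advised above is the next step.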