ncbi / sra-tools

SRA Tools

prefetch downloads additional binary files aside from sra #791

Closed malonzm1 closed 1 year ago

malonzm1 commented 1 year ago

Hi,

When I use prefetch to download scRNAseq sra files, several additional files such as CM000663.2 are downloaded. The log says:

2023-03-16T07:09:18 prefetch.3.0.3: Downloading via HTTPS...
2023-03-16T07:19:49 prefetch.3.0.3: HTTPS download succeed
2023-03-16T07:20:22 prefetch.3.0.3: 'SRR7159839' is valid
2023-03-16T07:20:22 prefetch.3.0.3: 1) 'SRR7159839' was downloaded successfully
2023-03-16T07:21:22 prefetch.3.0.3: 'SRR7159839' has 115 unresolved dependencies
2023-03-16T07:21:22 prefetch.3.0.3: 2) Downloading 'ncbi-acc:CM000663.2?vdb-ctx=refseq'...
2023-03-16T07:21:22 prefetch.3.0.3: Downloading via HTTPS...
2023-03-16T07:21:26 prefetch.3.0.3: HTTPS download succeed

and so on. I don't see this with bulk RNA-seq. May I know what these files are and what they do? Are they necessary for generating FASTQ files from SRA?

Thanks and good day.

klymenko commented 1 year ago

The short answer is: yes. SRR7159839 is compressed against reference sequences. The additional files are those reference sequences, and they are necessary for generating FASTQ.
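To see which reference sequences a run pulled in, one option is to scan the prefetch log for the dependency lines. The following is a minimal sketch, not part of sra-tools: the function name `summarize_prefetch_log` and both regular expressions are assumptions based only on the log lines quoted in this issue (prefetch 3.0.3), and other versions may format their messages differently.

```python
import re

# Assumed pattern for dependency downloads, taken from the quoted log, e.g.
#   ... prefetch.3.0.3: 2) Downloading 'ncbi-acc:CM000663.2?vdb-ctx=refseq'...
DEP_RE = re.compile(r"Downloading 'ncbi-acc:([^?']+)\?vdb-ctx=refseq'")

# Assumed pattern for the summary line, e.g.
#   ... 'SRR7159839' has 115 unresolved dependencies
COUNT_RE = re.compile(r"'(\w+)' has (\d+) unresolved dependencies")


def summarize_prefetch_log(text):
    """Return (run accession, announced dependency count, reference accessions seen).

    Hypothetical helper: it only reads log text and does not call prefetch.
    """
    run, announced = None, 0
    match = COUNT_RE.search(text)
    if match:
        run, announced = match.group(1), int(match.group(2))
    # Every reference accession that the log shows being downloaded, in order.
    refs = DEP_RE.findall(text)
    return run, announced, refs


# Sample input copied from the log excerpt in this issue.
SAMPLE_LOG = """\
2023-03-16T07:20:22 prefetch.3.0.3: 1) 'SRR7159839' was downloaded successfully
2023-03-16T07:21:22 prefetch.3.0.3: 'SRR7159839' has 115 unresolved dependencies
2023-03-16T07:21:22 prefetch.3.0.3: 2) Downloading 'ncbi-acc:CM000663.2?vdb-ctx=refseq'...
2023-03-16T07:21:22 prefetch.3.0.3: Downloading via HTTPS...
2023-03-16T07:21:26 prefetch.3.0.3: HTTPS download succeed
"""

if __name__ == "__main__":
    run, announced, refs = summarize_prefetch_log(SAMPLE_LOG)
    print(run, announced, refs)
```

On a complete log, comparing the announced count with the number of accessions found gives a quick check that every reference the run depends on was actually fetched.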

malonzm1 commented 1 year ago

Thanks!