saketkc / pysradb

Package for fetching metadata and downloading data from SRA/ENA/GEO
https://saketkc.github.io/pysradb
BSD 3-Clause "New" or "Revised" License
307 stars 50 forks

IO Error - No connection adapters for NCBI #162

Closed: BatheFu closed this issue 2 years ago

BatheFu commented 2 years ago

Description

I tried to download a subset of the runs in SRP218975, but the download failed.

What I Did

pysradb metadata SRP218975 --detailed | grep 'organism\|blood' | pysradb download
The supplied url column "None" cannot be found.

Using recommended_url instead.

Checking download URLs
Key error for: ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR100/SRR10009459 SRP218975 Resolving the fibrotic niche of human liver cirrhosis using single-cell transcriptomics SRX6747695/SRR10009459 SRP218975 Resolving the fibrotic niche of human liver cirrhosis using single-cell transcriptomics SRX6747695.sra
Key error for: ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR100/SRR10009458 SRP218975 Resolving the fibrotic niche of human liver cirrhosis using single-cell transcriptomics SRX6747694/SRR10009458 SRP218975 Resolving the fibrotic niche of human liver cirrhosis using single-cell transcriptomics SRX6747694.sra
Key error for: ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR100/SRR10009457 SRP218975 Resolving the fibrotic niche of human liver cirrhosis using single-cell transcriptomics SRX6747693/SRR10009457 SRP218975 Resolving the fibrotic niche of human liver cirrhosis using single-cell transcriptomics SRX6747693.sra
Key error for: ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR100/SRR10009456 SRP218975 Resolving the fibrotic niche of human liver cirrhosis using single-cell transcriptomics SRX6747692/SRR10009456 SRP218975 Resolving the fibrotic niche of human liver cirrhosis using single-cell transcriptomics SRX6747692.sra
The following files will be downloaded: 

run_accession                                                                                                            study_accession                           experiment_accession                                                                                                                                                                 recommended_url download_url                                                                                                                                                                                                                                                                                                                      out_dir                    filesize
SRR10009459 SRP218975 Resolving the fibrotic niche of human liver cirrhosis using single-cell transcriptomics SRX6747695 GSM4041173: Blood4; Homo sapiens; RNA-Seq 9606 Homo sapiens <NA> RNA-Seq TRANSCRIPTOMIC cDNA PAIRED SRS5297927 <NA> Illumina HiSeq 4000 Illumina HiSeq 4000 ILLUMINA 332443402 29940496781 332443402 52526057516 GSM4041173_r1 None            ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR100/SRR10009459 SRP218975 Resolving the fibrotic niche of human liver cirrhosis using single-cell transcriptomics SRX6747695/SRR10009459 SRP218975 Resolving the fibrotic niche of human liver cirrhosis using single-cell transcriptomics SRX6747695.sra /content/pysradb_downloads 0.0     
SRR10009458 SRP218975 Resolving the fibrotic niche of human liver cirrhosis using single-cell transcriptomics SRX6747694 GSM4041172: Blood3; Homo sapiens; RNA-Seq 9606 Homo sapiens <NA> RNA-Seq TRANSCRIPTOMIC cDNA PAIRED SRS5297926 <NA> Illumina HiSeq 4000 Illumina HiSeq 4000 ILLUMINA 336691930 29856759099 336691930 53197324940 GSM4041172_r1 None            ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR100/SRR10009458 SRP218975 Resolving the fibrotic niche of human liver cirrhosis using single-cell transcriptomics SRX6747694/SRR10009458 SRP218975 Resolving the fibrotic niche of human liver cirrhosis using single-cell transcriptomics SRX6747694.sra /content/pysradb_downloads 0.0     
SRR10009457 SRP218975 Resolving the fibrotic niche of human liver cirrhosis using single-cell transcriptomics SRX6747693 GSM4041171: Blood2; Homo sapiens; RNA-Seq 9606 Homo sapiens <NA> RNA-Seq TRANSCRIPTOMIC cDNA PAIRED SRS5297925 <NA> Illumina HiSeq 4000 Illumina HiSeq 4000 ILLUMINA 332382503 29677771591 332382503 52516435474 GSM4041171_r1 None            ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR100/SRR10009457 SRP218975 Resolving the fibrotic niche of human liver cirrhosis using single-cell transcriptomics SRX6747693/SRR10009457 SRP218975 Resolving the fibrotic niche of human liver cirrhosis using single-cell transcriptomics SRX6747693.sra /content/pysradb_downloads 0.0     
SRR10009456 SRP218975 Resolving the fibrotic niche of human liver cirrhosis using single-cell transcriptomics SRX6747692 GSM4041170: Blood1; Homo sapiens; RNA-Seq 9606 Homo sapiens <NA> RNA-Seq TRANSCRIPTOMIC cDNA PAIRED SRS5297924 <NA> Illumina HiSeq 4000 Illumina HiSeq 4000 ILLUMINA 318845103 28952300875 318845103 50377526274 GSM4041170_r1 None            ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR100/SRR10009456 SRP218975 Resolving the fibrotic niche of human liver cirrhosis using single-cell transcriptomics SRX6747692/SRR10009456 SRP218975 Resolving the fibrotic niche of human liver cirrhosis using single-cell transcriptomics SRX6747692.sra /content/pysradb_downloads 0.0     

Total size: 0.0

IO Error - No connection adapters were found for 'ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR100/SRR10009459 SRP218975 Resolving the fibrotic niche of human liver cirrhosis using single-cell transcriptomics SRX6747695/SRR10009459 SRP218975 Resolving the fibrotic niche of human liver cirrhosis using single-cell transcriptomics SRX6747695.sra'
  0% 0/4 [00:00<?, ?it/s]
IO Error - No connection adapters were found for 'ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR100/SRR10009458 SRP218975 Resolving the fibrotic niche of human liver cirrhosis using single-cell transcriptomics SRX6747694/SRR10009458 SRP218975 Resolving the fibrotic niche of human liver cirrhosis using single-cell transcriptomics SRX6747694.sra'
Traceback (most recent call last):
  File "/usr/local/bin/pysradb", line 8, in <module>
    sys.exit(parse_args())
  File "/usr/local/lib/python3.9/site-packages/pysradb/cli.py", line 1211, in parse_args
    download(
  File "/usr/local/lib/python3.9/site-packages/pysradb/cli.py", line 114, in download
    sradb.download(
  File "/usr/local/lib/python3.9/site-packages/pysradb/sradb.py", line 1548, in download
    thread_map(
  File "/usr/local/lib/python3.9/site-packages/tqdm/contrib/concurrent.py", line 94, in thread_map
    return _executor_map(ThreadPoolExecutor, fn, *iterables, **tqdm_kwargs)
  File "/usr/local/lib/python3.9/site-packages/tqdm/contrib/concurrent.py", line 76, in _executor_map
    return list(tqdm_class(ex.map(fn, *iterables, **map_args), **kwargs))
  File "/usr/local/lib/python3.9/site-packages/tqdm/std.py", line 1195, in __iter__
    for obj in iterable:
  File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 608, in result_iterator
    yield fs.pop().result()
  File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 438, in result
    return self.__get_result()
  File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 390, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.9/concurrent/futures/thread.py", line 52, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/local/lib/python3.9/site-packages/pysradb/sradb.py", line 91, in _handle_download
    download_file(download_url, srr_location)
  File "/usr/local/lib/python3.9/site-packages/pysradb/download.py", line 176, in download_file
    if file_size == os.path.getsize(tmp_file_path):
  File "/usr/local/lib/python3.9/genericpath.py", line 50, in getsize
    return os.stat(filename).st_size
FileNotFoundError: [Errno 2] No such file or directory: '/content/pysradb_downloads/GSM4041173: Blood4; Homo sapiens; RNA-Seq/9606 Homo sapiens <NA> RNA-Seq TRANSCRIPTOMIC cDNA PAIRED SRS5297927 <NA> Illumina HiSeq 4000 Illumina HiSeq 4000 ILLUMINA 332443402 29940496781 332443402 52526057516 GSM4041173_r1/SRR10009459 SRP218975 Resolving the fibrotic niche of human liver cirrhosis using single-cell transcriptomics SRX6747695.sra.part'

saketkc commented 2 years ago

Thanks for reporting this. For now, I would recommend saving the metadata to a file and then downloading the files with curl or wget:

pysradb metadata SRP218975 --detailed --saveto x.tsv
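
To make the second half of that workaround concrete, here is a minimal Python sketch that filters the saved metadata and collects download URLs for wget. It is not part of pysradb. It assumes x.tsv is tab-separated, and the column names used in the example (experiment_title, ena_fastq_http_1) are assumptions: check the header line of your own x.tsv and substitute whichever URL column is actually populated. The helper names select_urls and write_url_list are hypothetical.

```python
import csv
import io

# Placeholder strings that can appear where a URL is missing (assumed set).
MISSING = {"", "None", "<NA>", "nan"}


def select_urls(tsv_text, match, url_column, filter_column):
    """Return URLs from rows whose filter_column contains `match` (case-insensitive)."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    urls = []
    for row in reader:
        value = row.get(filter_column) or ""
        if match.lower() not in value.lower():
            continue
        url = (row.get(url_column) or "").strip()
        # Skip missing URLs and anything containing whitespace, which would
        # indicate a mis-parsed row like the ones in the report above.
        if url in MISSING or any(ch.isspace() for ch in url):
            continue
        urls.append(url)
    return urls


def write_url_list(tsv_path, out_path, match, url_column, filter_column):
    """Read the saved metadata file and write one URL per line to out_path."""
    with open(tsv_path, newline="") as handle:
        urls = select_urls(handle.read(), match, url_column, filter_column)
    with open(out_path, "w") as handle:
        handle.write("\n".join(urls) + ("\n" if urls else ""))
    return urls
```

For example, write_url_list("x.tsv", "urls.txt", "blood", "ena_fastq_http_1", "experiment_title") would write the matching URLs to urls.txt, which can then be passed to wget with "wget -i urls.txt". Filtering on a named column rather than grepping the printed table keeps the header and the field boundaries intact.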