LieberInstitute / recount3

Explore and download data from the recount3 project
http://lieberinstitute.github.io/recount3

database disk image is malformed #54

Open EperLuo opened 2 months ago

EperLuo commented 2 months ago

Hi! Thank you for this wonderful work. I was trying to download data with the create_rse() function, but ran into the following error:

2024-07-03 17:09:59.968347 caching file sra.sra.SRP167804.MD.gz.
adding rname 'http://duffel.rail.bio/recount3/mouse/data_sources/sra/metadata/04/SRP167804/sra.sra.SRP167804.MD.gz'
Error in BiocFileCache::bfcrpath(bfc, url, exact = TRUE, verbose = verbose) : 
  not all 'rnames' found or unique.
Calls: create_rse ... file_retrieve -> vapply -> FUN -> <Anonymous> -> <Anonymous>
In addition: Warning message:
In value[[3L]](cond) : 
trying to add rname 'http://duffel.rail.bio/recount3/mouse/data_sources/sra/metadata/04/SRP167804/sra.sra.SRP167804.MD.gz' produced error:
  database disk image is malformed
Execution halted

Provide a minimally reproducible example (reprex)

This is my code. It works for some datasets, but for most of them the error above occurs.

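  library("recount3")
  # Assumed setup: mouse projects, matching the failing URL above
  projects <- available_projects(organism = "mouse")
  SRA_Accession <- "SRP167804"  # study ID taken from the log above

  proj_info <- subset(
    projects,
    project == SRA_Accession & project_type == "data_sources"
  )

  rse_gene <- create_rse(proj_info[1, ])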

I wonder whether this is an internet connection problem. Is there a way to fix it? Any response would be much appreciated!

lcolladotor commented 1 month ago

Hi @EperLuo,

I believe that you are running into the issue described at https://github.com/Bioconductor/BiocFileCache/issues/48 and https://github.com/curl/curl/issues/13725.

What is the output of curl::curl_version() for you? Is it version 8.6.0?

best, Leo
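
The version check asked about above can be run roughly as follows. This is a minimal sketch: `is_suspect_curl()` and `suspect_version` are hypothetical names introduced here for illustration, and the only version compared against is the 8.6.0 mentioned in this thread. Whether other libcurl versions behave the same way is not established here.

```r
# Version asked about in this thread (an assumption that it is the one
# that matters; other versions are not checked).
suspect_version <- "8.6.0"

# Hypothetical helper: TRUE when the reported libcurl version string is
# exactly the suspect version.
is_suspect_curl <- function(version_string, suspect = suspect_version) {
  identical(trimws(version_string), suspect)
}

# Report the libcurl version that the R 'curl' package is linked against,
# if that package is installed.
if (requireNamespace("curl", quietly = TRUE)) {
  libcurl_version <- curl::curl_version()$version
  message("libcurl version: ", libcurl_version)
  message("Matches ", suspect_version, ": ", is_suspect_curl(libcurl_version))
}
```

Posting the `libcurl version` line in a reply would answer the question directly.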