markziemann / dee2

Digital Expression Explorer 2 (DEE2): a repository of uniformly processed RNA-seq data
http://dee2.io
GNU General Public License v3.0

prefetch error trying to run the docker container #100

Open rcastelo opened 1 year ago

rcastelo commented 1 year ago

hi,

Looking at the server files for human (https://dee2.io/huge/hsapiens), it appears that no new samples have been added to the DEE2 pipeline since 2021. For that reason, I want to run the DEE2 pipeline on some recent datasets I'm interested in (e.g., GSE115760, https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE115760). When running the docker container, I encountered an error related to prefetch, which can be reproduced with the E. coli example call given in the README:

$ docker run mziemann/tallyup ecoli SRR2637695,SRR2637696,SRR2637697,SRR2637698

Starting pipeline with species ecoli and accession SRR2637695
SRR2637695
User input species and SRA metadata match. OK.
Starting /dee2/code/volunteer_pipeline.sh SRR2637695
    current disk space = 355352172
    free memory = 393678772 
SRR2637695 check if SRA file exists and download if neccessary

2023-02-20T13:48:27 prefetch.2.8.2 sys: connection failed while opening file within cryptographic module - mbedtls_ssl_handshake returned -9984 ( X509 - Certificate verification failed, e.g. CRL, CA or signature check failed )
2023-02-20T13:48:27 prefetch.2.8.2 sys: mbedtls_ssl_get_verify_result returned 0x8 (  !! The certificate is not correctly signed by the trusted CA  )
2023-02-20T13:48:27 prefetch.2.8.2 err: path not found while resolving tree within virtual file system module - 'SRR2637695' cannot be found.
SRR2637695 failed download with prefetch
rm: cannot remove '*fastq': No such file or directory
rm: cannot remove '*.sra': No such file or directory
rm: cannot remove '*tsv': No such file or directory
mv: cannot stat '/dee2/ncbi/public/sra/SRR2637695.sra': No such file or directory

do I have to configure anything on my side to enable the download of the SRA data from NCBI?

thanks!

markziemann commented 1 year ago

Hi Robert, Thanks for your email.

It looks like the prefetch version bundled in the docker image (2.8.2, according to your log) no longer works. I will update the image to the latest prefetch to see if that solves the problem.

There is another approach. Use a separate script to prefetch the SRA archives. Ensure that the archives are named in the following way:

hsapiens_SRR7309392.sra, hsapiens_SRR7309393.sra, etc
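As a rough sketch of what such a separate script could look like: the function names `sra_name` and `fetch_archives` are hypothetical, and the sketch assumes a working `prefetch` from a recent sra-tools release on the host, one that supports `--output-file` for single-accession downloads. It is not part of the DEE2 code.

```shell
#!/usr/bin/env bash
# Sketch only: download SRA archives outside the container and name them
# <species>_<accession>.sra, the naming scheme described above.
set -eu

# Build the expected archive name, e.g. hsapiens_SRR7309392.sra
sra_name() {
  printf '%s_%s.sra\n' "$1" "$2"
}

# Download every accession into the current directory under that name.
# Assumption: the host prefetch accepts --output-file.
fetch_archives() {
  species="$1"
  shift
  for accession in "$@"; do
    prefetch --output-file "$(sra_name "$species" "$accession")" "$accession"
  done
}
```

Calling, for example, `fetch_archives hsapiens SRR7309392 SRR7309393` from the directory that will be mounted into the container should leave the two archives there under the expected names.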

Then the pipeline will work using the -d option.

docker run -v $(pwd):/dee2/mnt mziemann/tallyup hsapiens -d

Hope this helps.

Mark


markziemann commented 1 year ago

I've just updated the Docker image and now the prefetch is working okay. You will need to pull the new image for it to work.

rcastelo commented 1 year ago

Excellent, thanks!! Two further questions, I've pulled the new image and successfully run:

$ docker run mziemann/tallyup ecoli SRR2637695,SRR2637696,SRR2637697,SRR2637698

However, after it finished, no file had been created in the directory from which I ran the previous command.

Do I have to bind-mount a specific path from the docker filesystem to get the resulting ZIP files?

Once I run it with the human datasets I'm interested in, how can I contribute them to the DEE2 server?

Thanks again!

markziemann commented 1 year ago

You'll need to find out the container ID for the analysis. If it was the last docker task on the computer it can be found like this:

docker ps -alq

You can copy out the data from that container to the current working directory like this:

docker cp $(docker ps -alq):/dee2/data/ .
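The two steps above can be combined into a small helper. This is only a sketch: the function name `copy_results` is hypothetical, and it assumes the analysis container is still the most recently created one on the machine. Looking up the ID once and checking that it is non-empty avoids running `docker cp` with an empty container name.

```shell
#!/usr/bin/env bash
# Sketch only: copy /dee2/data/ out of the most recently created container.
set -eu

copy_results() {
  dest="${1:-.}"
  # -a: include stopped containers, -l: latest only, -q: print just the ID
  cid="$(docker ps -alq)"
  if [ -z "$cid" ]; then
    echo "no container found" >&2
    return 1
  fi
  docker cp "${cid}:/dee2/data/" "$dest"
}
```

Running `copy_results` with no argument copies into the current working directory; a different destination directory can be passed as the first argument.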