langmead-lab / monorail-external

examples to run monorail externally
MIT License

SRA Toolkit in need of update #20

Closed GauravR31 closed 1 year ago

GauravR31 commented 1 year ago

Hello,

Could you please update the sra-toolkit used in the Pump Docker image? The container currently uses prefetch-2.9.1, which seems to have networking issues when fetching SRA data ("Failed to create TLS stream", "mbedtls_ssl_handshake returned -9984"). Most resources online point to updating to the latest version of sra-toolkit, which ships prefetch-3.0.0 (or higher) and resolves this issue.

ChristopherWilks commented 1 year ago

Updating prefetch/SRA-toolkit would be a reasonable thing to do. That said, it's not as simple as just switching out the versions.
This is due to changes in the interface and configuration between versions. For example, 3.0.0 handles dbGaP keys differently enough from 2.9 that it requires some code change(s) in Monorail.

I'll look into how extensive this change would be given my limited time to support this and post back.

SKarr07 commented 1 year ago

Hello, are you still working on this? I had the same issue as in the original comment:

prefetch.2.9.1 sys: connection failed while opening file within cryptographic module - mbedtls_ssl_handshake returned -9984 ( X509 - Certificate verification failed, e.g. CRL, CA or signature check failed )

GauravR31 commented 1 year ago

@SKarr07 For what it's worth, I moved the downloading part out of the container: I installed sra-toolkit 3.0.0 on the machine where the container runs, downloaded the FASTQ files there, and then just ran each project as 'local'. Hope this helps!
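
A minimal sketch of the host-side download step, assuming sra-toolkit 3.x is installed on the host and `prefetch` and `fasterq-dump` are on the `PATH`. The function names, the `sra/` and `fastq/` directory layout, and the example accession are hypothetical and chosen for illustration only; the thread does not show how Monorail expects 'local' FASTQ files to be arranged, so the sketch stops once the FASTQs exist.

```python
import shutil
import subprocess
from pathlib import Path


def build_download_commands(accession, out_dir, threads=4):
    """Return (argv, cwd) pairs: prefetch first, then fasterq-dump.

    The sra/ and fastq/ subdirectories are an illustrative layout,
    not something Monorail prescribes.
    """
    out = Path(out_dir).resolve()
    sra_dir = out / "sra"
    fastq_dir = out / "fastq"
    prefetch = ["prefetch", accession, "--output-directory", str(sra_dir)]
    fasterq = [
        "fasterq-dump", accession,
        "--outdir", str(fastq_dir),
        "--threads", str(threads),
        "--split-files",
    ]
    # fasterq-dump runs from sra_dir so the accession resolves to the
    # directory that prefetch just wrote there.
    return [(prefetch, out), (fasterq, sra_dir)]


def download_fastq(accession, out_dir, threads=4):
    """Run both steps on the host; raises if a tool is missing or fails."""
    for tool in ("prefetch", "fasterq-dump"):
        if shutil.which(tool) is None:
            raise RuntimeError(f"{tool} not found on PATH")
    out = Path(out_dir).resolve()
    (out / "sra").mkdir(parents=True, exist_ok=True)
    (out / "fastq").mkdir(parents=True, exist_ok=True)
    for argv, cwd in build_download_commands(accession, out, threads):
        subprocess.run(argv, cwd=cwd, check=True)
    return out / "fastq"


if __name__ == "__main__":
    # Hypothetical accession and output directory, for illustration only.
    print(download_fastq("SRR000001", "downloads"))
```

The resulting FASTQ directory can then be handed to the container as the input for a 'local' run.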

ChristopherWilks commented 1 year ago

@GauravR31's solution is a reasonable workaround for now. I may still release an updated container when I get around to it, but no guarantees on that. Also, given that both the source code and the existing containers are public, others can update it themselves if they're so inclined.

ChristopherWilks commented 1 year ago

I have updated the recount/monorail pump image to 1.1.0, which now uses sratoolkit 2.11.2. While that's not 3.x.x, it does seem to work with both old and new accessions (at least public ones; I have not tested dbGaP ones yet). I have also included the ability to override the main download script used by pump, so with some configuration work, one could use the current pump image with custom download code, including a newer sratoolkit version.

ChristopherWilks commented 1 year ago

Actually, I've updated to sratoolkit 3.0.0 today, in the recount-pump:1.1.1 container image. 2.11.2 didn't seem to work with dbGaP downloads.