Closed maciejmotyka closed 2 years ago
Hello, I've not tried to build the Docker image recently, as it is simply easier to pull it from Docker hub. Could you try that?
Pulling the image from Docker Hub works.
I wanted to re-build the image with a newer version of the SRA Toolkit, because prefetch has been extremely unreliable lately, causing the pipeline to fail constantly. I'm not sure if it's my network or whether something changed at the SRA database, but prefetch fails 9 times out of 10 when downloading a ~500 MB file, e.g. SRR2637697.
I have the newest version of prefetch locally, and it takes a whole 5 minutes to download this run over HTTP, but at least it doesn't fail.
Anyway, I fixed the pip/Python conflict by starting from Ubuntu 18.04, but then the build crashed due to an invalid link to the STAR binary at:
Step 26/34 : RUN cd sw && wget -c "https://github.com/alexdobin/STAR/raw/master/bin/Linux_x86_64_static/STAR" && chmod +x STAR && cp STAR /usr/local/bin/STAR
I think I'll just write a script to download all necessary runs locally and then copy them over to the DEE2 container for processing.
Another way to circumvent this problem without altering the Docker image is to prefetch first using whichever SRA Toolkit version you like, then run the Docker image with the -d parameter, which searches the current working directory for SRA archives.
docker run -v $(pwd):/dee2/mnt mziemann/tallyup hsapiens -d
I have been running it this way on our HPC as it keeps the CPUs busier. Could you give this a try?
Use prefetch like this:
prefetch -X 9999999999999 -o ${ORG}_${SRR}.sra $SRR
where SRR is the run accession and ORG is the species, e.g. hsapiens.
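For anyone scripting this step, here is a minimal sketch of a download loop that retries prefetch a few times before giving up, using the file naming above. The function names (sra_name, retry, fetch_runs), the MAX_TRIES variable and its default of 5 are illustrative assumptions, not part of DEE2 or the SRA Toolkit.

```shell
# Sketch only: function names and MAX_TRIES are illustrative assumptions.

# Build the file name suggested above: <ORG>_<SRR>.sra
sra_name() {
  printf '%s_%s.sra\n' "$1" "$2"
}

# Run a command up to N times, stopping at the first success.
# Usage: retry N command [args...]
retry() {
  local tries="$1" i=1
  shift
  while [ "$i" -le "$tries" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt $i/$tries failed: $*" >&2
    i=$((i + 1))
  done
  return 1
}

# Download each accession for one organism into the current directory.
# A non-empty file with the expected name is treated as already downloaded.
# Usage: fetch_runs ORG SRR [SRR...]
fetch_runs() {
  local org="$1" srr out
  shift
  for srr in "$@"; do
    out=$(sra_name "$org" "$srr")
    if [ -s "$out" ]; then
      continue
    fi
    retry "${MAX_TRIES:-5}" prefetch -X 9999999999999 -o "$out" "$srr" \
      || echo "giving up on $srr" >&2
  done
}
```

Running `fetch_runs hsapiens SRR2637697` in the directory that gets mounted at /dee2/mnt should leave hsapiens_SRR2637697.sra where the -d option can find it.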
Thanks. That's ultimately what I ended up doing.
I've set up a downloader pod on a fast-network node with the latest SRA-Tools Docker image from https://hub.docker.com/r/ncbi/sra-tools and another pod running the pipeline image on a high-memory node. The downloader pod saves SRA files to the same storage directory that the pipeline pod has mounted under /dee2/mnt, so the two can work asynchronously.
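One thing to watch with two pods sharing a directory is the pipeline picking up a file that is still being written. A possible way to handle that, sketched below, is to download into a staging directory on the same volume and move each finished file into the shared directory; a move within one filesystem is a rename, so the file appears all at once. The staging layout and the helper names publish_sra and ready_runs are hypothetical, and this does not describe how the pipeline itself scans the directory.

```shell
# Sketch only: the staging directory and both helper names are hypothetical.

# Move a finished download from staging into the shared directory.
# On the same filesystem mv is a rename, so no partial file is visible.
# Usage: publish_sra STAGED_FILE SHARED_DIR
publish_sra() {
  mv "$1" "$2/"
}

# Print the accessions of the ORG_*.sra files present in a directory.
# Usage: ready_runs DIR ORG
ready_runs() {
  local dir="$1" org="$2" f name
  for f in "$dir/${org}_"*.sra; do
    [ -e "$f" ] || continue        # pattern matched nothing
    name="${f##*/}"                # strip the directory
    name="${name#"${org}_"}"       # strip the organism prefix
    echo "${name%.sra}"            # strip the extension
  done
}
```

The downloader side would call publish_sra after each successful prefetch, and ready_runs gives a quick way to see which accessions are waiting in the shared directory.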
Hi Mark,
I was trying to rebuild the Docker image from scratch and it fails. I think it's because pip upgrades itself to a version that is no longer compatible with the Python 3.5 that comes with Ubuntu 16.04.
pip dropped support for Python 3.5 in version 21.0; see the changelog: https://pip.pypa.io/en/stable/news/#v21-0
See also this thread, where they had a similar error message: https://stackoverflow.com/questions/65869296/installing-pip-is-not-working-in-python-3-6
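If someone does want to keep the Ubuntu 16.04 base, one possible workaround is to cap pip below 21.0 when upgrading it. The helper below is only a sketch: pip_requirement is a made-up name, and the install line in the comment is an assumption about how the upgrade step could be written, not the actual line from the DEE2 Dockerfile.

```shell
# Sketch only: pip_requirement is a hypothetical helper.

# Print a pip requirement that still supports the given Python version.
# pip 21.0 dropped Python 3.5, so that version needs a cap below 21.
# Usage: pip_requirement MAJOR.MINOR
pip_requirement() {
  case "$1" in
    3.5) echo 'pip<21' ;;
    *)   echo 'pip' ;;
  esac
}

# Possible use in a build step (assumed, not taken from the Dockerfile):
#   ver=$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')
#   python3 -m pip install --upgrade "$(pip_requirement "$ver")"
```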
Below are selected relevant lines from docker build output:
Could you see if it fails for you as well?