Pipeline continues downloading essential files at every run

Samiah-Kanwar commented 2 years ago

Hey @rzlim08, I successfully installed the latest idseq-workflow. Now, when I run the following command, the primer, genome, and other files start downloading again every time, and the run takes hours and hours.

time miniwdl run --verbose idseq-workflows/consensus-genome/run.wdl docker_image_id=idseq-consensus-genome fastqs_0=SARSCoV2_firstBatch/S11_L001_R1_001.fastq.gz fastqs_1=SARSCoV2_firstBatch/S11_L001_R2_001.fastq.gz sample=S11 technology=Illumina ref_fasta=s3://idseq-public-references/consensus-genome/MN908947.3.fa -i idseq-workflows/consensus-genome/test/local_test.yml --debug

In fact, I already have all the files downloaded in /tmp/miniwdl_download_cache/files/s3/idseq-public-references/_consensus-genome, but the pipeline starts downloading all of them again and then aborts with an error (sometimes kraken_coronavirus_db_only.tar.gz is not found, and sometimes hg38.fa.gz is not found). To work around this, I manually copied the essential files into place, but nothing worked.

PS: I always run export MINIWDL__DOWNLOAD_CACHE__DIR=/tmp/miniwdl_download_cache before running the main command (shown above).

Kindly help

The previous version of idseq-workflow was working fine on my workstation, but I am facing difficulties with the latest update.

rzlim08 commented 2 years ago

Hi Samiah, in addition to MINIWDL__DOWNLOAD_CACHE__DIR you may have to set

export MINIWDL__DOWNLOAD_CACHE__PUT=true
export MINIWDL__DOWNLOAD_CACHE__GET=true
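
For reference, the three settings mentioned in this thread can be combined into one setup step that runs before the workflow command. This is a minimal sketch, not an official recipe: the cache path is the one from the original post, and the comments describe how PUT and GET are understood to behave (store downloaded files, reuse stored files), which is worth checking against the miniwdl documentation.

```shell
# Cache location used in the thread; any writable directory should work.
export MINIWDL__DOWNLOAD_CACHE__DIR=/tmp/miniwdl_download_cache
# Assumption: PUT saves newly downloaded files into the cache,
# and GET reuses files that are already cached.
export MINIWDL__DOWNLOAD_CACHE__PUT=true
export MINIWDL__DOWNLOAD_CACHE__GET=true

# Make sure the cache directory exists before the first run.
mkdir -p "$MINIWDL__DOWNLOAD_CACHE__DIR"

# Print the settings to confirm they are visible in this shell session.
env | grep '^MINIWDL__DOWNLOAD_CACHE__' | sort
```

The exports only apply to the current shell session, so they need to be set in the same shell that launches miniwdl run (or added to a shell profile).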

I think with newer versions of miniwdl you can also use miniwdl configure. @mlin may be able to help out more, though!
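
As a rough illustration of the configuration-file route, the same settings could be kept in a file instead of environment variables. The sketch below assumes that miniwdl environment variables of the form MINIWDL__SECTION__KEY correspond to a [section] and key in its configuration file; the file name miniwdl_example.cfg is hypothetical, and the commented --cfg usage is an assumption to verify against the miniwdl documentation.

```shell
# Hypothetical file location for this sketch; the path that miniwdl
# reads by default may differ.
cfg_file="${TMPDIR:-/tmp}/miniwdl_example.cfg"

# Assumed mapping: MINIWDL__DOWNLOAD_CACHE__PUT -> [download_cache] put
cat > "$cfg_file" <<'EOF'
[download_cache]
dir = /tmp/miniwdl_download_cache
put = true
get = true
EOF

# Show the file that was written.
cat "$cfg_file"

# Assumed usage (check the miniwdl documentation before relying on it):
#   miniwdl run --cfg "$cfg_file" idseq-workflows/consensus-genome/run.wdl
# followed by the same inputs as in the original command.
```

Keeping the settings in a file avoids having to repeat the exports in every new shell session.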

Samiah-Kanwar commented 2 years ago

Thank you @rzlim08! It worked.

chanzuckerberg / idseq-workflows

Pipeline continues downloading essential files at every run #179