chanzuckerberg / idseq-workflows

Portable WDL workflows for IDseq production pipelines
https://idseq.net/
MIT License

Error while Running Test Example - Consensus Genome #142

Closed Samiah-Kanwar closed 3 years ago

Samiah-Kanwar commented 3 years ago

I successfully installed miniwdl and all other dependencies by following the steps on the GitHub page. When I try to run the consensus genome test example with the command:

miniwdl run --verbose consensus-genome/run.wdl docker_image_id=idseq-consensus-genome fastqs_0=idseq-workflows/consensus-genome/test/sample_sars-cov-2_paired_r1.fastq.gz fastqs_1=idseq-workflows/consensus-genome/test/sample_sars-cov-2_paired_r2.fastq.gz sample=sample_sars-cov-2_paired technology=Illumina -i idseq-workflows/consensus-genome/test/local_test.yml

I am getting the following error:

miniwdl-run docker task rejected, desired state shutdown: invalid bind mount source, must be an absolute path: /tmp/miniwdl_download_cache/files/s3/idseq-public-references/_consensus-genome/human_chr1.fa :: error: "RuntimeError", dir: "/home/samiahkanwar/Desktop/AKU_System/IDSeqPipeline_9Jul2021/idseq-workflows/20210713_122226_consensus_genome", from_dir: "/home/samiahkanwar/Desktop/AKU_System/IDSeqPipeline_9Jul2021/idseq-workflows/20210713_122226_consensus_genome/call-RemoveHost"

Kindly help me in this regard. I am available to provide further information.

mlin commented 3 years ago

@Samiah-Kanwar Apologies for the slow reply, my colleagues alerted me to this question but I lost track of it in the meantime.

Will you please confirm the operating system(s) involved in the environment here? In particular, is Windows involved in some way? The known history of this error message seems related to using Docker on Windows, but it will be interesting if that's not the case here.

(Bookmarking source of the error message)

Samiah-Kanwar commented 3 years ago

Hi @mlin, thank you for your response. I am using Linux (Ubuntu 21.04); Windows is not involved in any way. The whole system is Linux-based.

mlin commented 3 years ago

@Samiah-Kanwar thanks; this is mysterious, then. Can you post the complete log file? (It should be left behind as workflow.log inside the run directory.) Lacking strong leads, I'd just like to look for any other clues about what might be going wrong. Even better, if you can run it again with --debug, that will leave the maximum amount of detail in the log file.

Samiah-Kanwar commented 3 years ago

Dear @mlin, I ran the pipeline again with --debug and attached the workflow.log file.

Samiah-Kanwar commented 3 years ago

[Screenshot from 2021-07-29 17-43-18]

Please also find attached an image of the folder; the folder name looks weird.

mlin commented 3 years ago

@Samiah-Kanwar Thanks, the weird folder name does indeed have something to do with it. In the log file I can see that the environment variable MINIWDL__DOWNLOAD_CACHE__DIR seems to be set to \u0003\u0003\u0003\u0003\u0003\u0003/tmp/miniwdl_download_cache, with six extraneous control characters preceding /tmp/miniwdl_download_cache (which is where the value is supposed to start). That would explain the error, since a path that begins with control characters is not an absolute path. I don't know how this could have happened. Can you please try the following at the command line:

export MINIWDL__DOWNLOAD_CACHE__DIR=/tmp/miniwdl_download_cache

and then miniwdl run again?

We can make some usability improvements in miniwdl to sanity-check the setting of that environment variable so that the error message isn't so confusing. But I can't think of how those control characters could have gotten there.
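The kind of sanity check described above could look roughly like the following. This is only a minimal sketch, not miniwdl's actual code: the function name `check_cache_dir` is hypothetical, and the assumption is simply that a usable cache directory must be an absolute path with no ASCII control characters.

```python
import os


def check_cache_dir(value: str) -> str:
    """Return `value` unchanged if it looks like a clean absolute path.

    Hypothetical helper for illustration; not part of miniwdl.
    """
    # ASCII control characters are code points 0-31 and 127 (DEL).
    if any(ord(ch) < 32 or ord(ch) == 127 for ch in value):
        raise ValueError(f"control characters in path: {value!r}")
    if not os.path.isabs(value):
        raise ValueError(f"not an absolute path: {value!r}")
    return value


# The value seen in the log: six \u0003 characters before the real path.
bad_value = "\u0003" * 6 + "/tmp/miniwdl_download_cache"
try:
    check_cache_dir(bad_value)
except ValueError as exc:
    # repr() makes the otherwise invisible characters visible in the message.
    print(f"rejected: {exc}")
```

Reporting the value with `repr()` is the useful part here, because the stray characters are invisible when the path is printed normally.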

Samiah-Kanwar commented 3 years ago

Thank you very much, @mlin. I tried the command:

export MINIWDL__DOWNLOAD_CACHE__DIR=/tmp/miniwdl_download_cache

and then miniwdl run. Thankfully, this time the absolute path error is gone, but now I am facing a new error related to a null value. The exact error follows, and I have also attached the workflow.log file.

2021-07-30 14:37:06.537 wdl.w:consensus_genome env :: node: "call-FilterReads", values: {"kraken2_db_tar_gz": "s3://idseq-public-references/consensus-genome/kraken_coronavirus_db_only.tar.gz", "RemoveHost.host_removed_fastqs": ["/home/samiahkanwar/Desktop/AKU_System/IDSeqPipeline_miniwdl/20210730_143628_consensus_genome/call-RemoveHost/out/host_removed_fastqs/0/no_host_1.fq.gz", "/home/samiahkanwar/Desktop/AKU_System/IDSeqPipeline_miniwdl/20210730_143628_consensus_genome/call-RemoveHost/out/host_removed_fastqs/1/no_host_2.fq.gz"], "prefix": "", "FetchSequenceByAccessionId.sequence_fa": null, "ref_fasta": null, "docker_image_id": "idseq-consensus-genome"}
2021-07-30 14:37:06.540 wdl.w:consensus_genome Traceback (most recent call last):
  File "/home/samiahkanwar/anaconda3/lib/python3.7/site-packages/WDL/runtime/workflow.py", line 802, in _workflow_main_loop
    next_call = state.step(cfg, stdlib)
  File "/home/samiahkanwar/anaconda3/lib/python3.7/site-packages/WDL/runtime/workflow.py", line 266, in step
    raise exn
  File "/home/samiahkanwar/anaconda3/lib/python3.7/site-packages/WDL/runtime/workflow.py", line 263, in step
    res = self._do_job(cfg, stdlib, job)
  File "/home/samiahkanwar/anaconda3/lib/python3.7/site-packages/WDL/runtime/workflow.py", line 365, in _do_job
    call_inputs = call_inputs.bind(name, expr.eval(env, stdlib=stdlib))
  File "/home/samiahkanwar/anaconda3/lib/python3.7/site-packages/WDL/Expr.py", line 118, in eval
    ans = self._eval(env, stdlib)
  File "/home/samiahkanwar/anaconda3/lib/python3.7/site-packages/WDL/Expr.py", line 1113, in _eval
    return f(self, env, stdlib)
  File "/home/samiahkanwar/anaconda3/lib/python3.7/site-packages/WDL/StdLib.py", line 232, in __call__
    return self._call_eager(expr, [arg.eval(env, stdlib=stdlib) for arg in expr.arguments])
  File "/home/samiahkanwar/anaconda3/lib/python3.7/site-packages/WDL/StdLib.py", line 745, in _call_eager
    raise Error.NullValue(expr)
WDL.Error.NullValue: Null value

2021-07-30 14:37:06.541 wdl.w:consensus_genome workflow consensus_genome (idseq-workflows/consensus-genome/run.wdl Ln 9 Col 1) failed :: dir: "/home/samiahkanwar/Desktop/AKU_System/IDSeqPipeline_miniwdl/20210730_143628_consensus_genome", error: "NullValue", message: "Null value", node: "call-FilterReads", pos: {"source": "/home/samiahkanwar/Desktop/AKU_System/IDSeqPipeline_miniwdl/idseq-workflows/consensus-genome/run.wdl", "line": 117, "column": 33}
2021-07-30 14:37:06.541 wdl.w:consensus_genome aborting workflow
2021-07-30 14:37:06.543 miniwdl-run Null value :: error: "NullValue", node: "call-FilterReads", pos: {"source": "/home/samiahkanwar/Desktop/AKU_System/IDSeqPipeline_miniwdl/idseq-workflows/consensus-genome/run.wdl", "line": 117, "column": 33}, dir: "/home/samiahkanwar/Desktop/AKU_System/IDSeqPipeline_miniwdl/20210730_143628_consensus_genome"

I think the error is coming from run.wdl line 117, in the FilterReads call:

                    ref_fasta = select_first([ref_fasta, FetchSequenceByAccessionId.sequence_fa]),
                    kraken2_db_tar_gz = kraken2_db_tar_gz,
                    docker_image_id = docker_image_id

Please help me!

workflow.log

mlin commented 3 years ago

@Samiah-Kanwar Thanks for the update, glad we're at least making progress. I spoke with some of my colleagues today and we think the wiki instructions are probably a little out of date, leading to this problem. We will be going through them to check in the next couple of days and I'll update you here. Sorry for the inconvenience!

Samiah-Kanwar commented 3 years ago

Thank you very much, @mlin. I highly appreciate your efforts, and I will be waiting for the update.

rzlim08 commented 3 years ago

Hi @Samiah-Kanwar, thanks for bringing this up. We were missing the ref_fasta=s3://idseq-public-references/consensus-genome/MN908947.3.fa parameter in the test example. I've updated the readme, so running the updated command should work now.
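For readers wondering why a missing ref_fasta surfaces as "Null value": WDL's select_first returns the first non-null element of its array and fails when every element is null. In the logged environment both ref_fasta and FetchSequenceByAccessionId.sequence_fa were null. The Python below is only a rough model of that behavior under those assumptions; the `NullValue` class and this `select_first` function are stand-ins written for illustration, not miniwdl's implementation.

```python
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")


class NullValue(Exception):
    """Stand-in for the error raised when no non-null value is available."""


def select_first(values: Sequence[Optional[T]]) -> T:
    """Return the first element that is not None, modeling WDL's select_first."""
    for value in values:
        if value is not None:
            return value
    raise NullValue("Null value")


# Before the fix: neither input was provided, so every candidate is null.
try:
    select_first([None, None])
except NullValue as exc:
    print(f"failed: {exc}")

# After the fix: ref_fasta is passed on the command line, so it is selected.
ref_fasta = "s3://idseq-public-references/consensus-genome/MN908947.3.fa"
print(select_first([ref_fasta, None]))
```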

Samiah-Kanwar commented 3 years ago

Hi @rzlim08, thank you very much. I ran the updated command and it worked; the whole pipeline ran successfully on the test example. Once again, thank you very much @mlin and @rzlim08, you guys are awesome.