NBISweden / aMeta

Ancient microbiome snakemake workflow
MIT License

Rule PMD_score failing with no error message #71

Closed ZoePochon closed 1 year ago

ZoePochon commented 2 years ago

The rule PMD_scores sometimes fails with no error message.

Log file:

[Wed Jul 20 18:08:30 2022]
Job 0: PMD_scores: COMPUTING PMD SCORES

Activating environment modules: GCC/10.2.0, OpenMPI/4.0.5, R/4.0.4, seqtk/1.3, SAMtools/1.12, MALT/0.5.3, HOPS/0.33, MEGAN/6.21.18, PMDtools/0.60
The following modules were not unloaded:
  (Use "module --force purge" to unload all):

  1) snicenvironment   2) systemdefault
[Wed Jul 20 18:09:02 2022]
Error in rule PMD_scores:
    jobid: 0
    output: results/AUTHENTICATION/Gok2cop_CGATGT_L007_merged_001.130313_SN344_0244_AC1U02ACXX/305/LN899819.1/305.PMDscores.txt
    log: logs/PMD_SCORES/Gok2cop_CGATGT_L007_merged_001.130313_SN344_0244_AC1U02ACXX_305_LN899819.1.log (check log file(s) for error message)
    shell:
        samtools view -h results/AUTHENTICATION/Gok2cop_CGATGT_L007_merged_001.130313_SN344_0244_AC1U02ACXX/305/LN899819.1/305.sorted.bam | pmdtools --number 100000 --printDS > results/AUTHENTICATION/Gok2cop_CGATGT_L007_merged_001.130313_SN344_0244_AC1U02ACXX/305/LN899819.1/305.PMDscores.txt
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Removing output files of failed job PMD_scores since they might be corrupted:
results/AUTHENTICATION/Gok2cop_CGATGT_L007_merged_001.130313_SN344_0244_AC1U02ACXX/305/LN899819.1/305.PMDscores.txt
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message

ZoePochon commented 1 year ago

Still about rule PMD_scores (and rule Deamination). I found that they all work for Gok2 when using this change in the pipeline:

git diff
diff --git a/workflow/rules/authentic.smk b/workflow/rules/authentic.smk
index 27d2553..f6730ee 100644
--- a/workflow/rules/authentic.smk
+++ b/workflow/rules/authentic.smk
@@ -177,7 +177,7 @@ rule PMD_scores:
     envmodules:
         *config["envmodules"]["malt"],
     shell:

Basically, it seems that the pmdtools environment is not exactly the same as the pmdtools bash script. You can check by comparing the last two log files in /proj/nobackup/metagenomics/pochonz/Gok002/.snakemake/log/: the one from the 19th of August, which used the pmdtools environment, and the one from the 22nd of August, which used the change above. The change is hard-coded, though, so we have to find another way to fix this.

clami66 commented 1 year ago

Hey @ZoePochon, just to let you know that we have an idea on how we can fix this and will try to get it done next week!

clami66 commented 1 year ago

@ZoePochon I am trying to reproduce your issue, without success. I am manually running this command on Kebnekaise:

samtools view -h results/AUTHENTICATION/Gok2cop_CGATGT_L007_merged_001.130313_SN344_0244_AC1U02ACXX/305/LN899819.1/305.sorted.bam | pmdtools --number 100000 --printDS

As you can see, I don't need to point directly at the pmdtools python script if I load the pmdtools module.

Just to double check, I am wondering if you remembered to export the envvar:

export ANCIENT_MICROBIOME_ENVMODULES=/proj/nobackup/metagenomics/envmodules.yaml

before your run?

clami66 commented 1 year ago

After discussing this on Slack with @ZoePochon, it seems like this is an intermittent issue, probably not related to the environment or the Python version. Running more tests and analyzing the logs is the way to go now, so we'll wait and see.

clami66 commented 1 year ago

After some debugging, we discovered that this issue was introduced by PMDtools, which stops reading and returns after --number lines of samtools output. samtools then attempts to write into a closed pipe and exits with an error status. That makes Snakemake fail, since bash is run with set -euo pipefail.
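The failure mode described above can be reproduced without samtools or PMDtools. The sketch below uses stand-ins that are assumptions for illustration only: `yes` plays the producer that keeps writing (samtools view), and `head -n 3` plays the consumer that exits early (pmdtools --number). The last case shows one possible way to tolerate the producer's exit status; it is not necessarily what the actual fix does.

```shell
#!/usr/bin/env bash
# Stand-ins (assumptions, not the real tools):
#   yes        -> a producer that never stops writing (like samtools view)
#   head -n 3  -> a consumer that exits after N lines (like pmdtools --number)

# 1) Default bash: the pipeline status is that of the last command (head),
#    so the producer dying on the closed pipe goes unnoticed.
status_default=0
bash -c 'yes | head -n 3 > /dev/null' || status_default=$?

# 2) Strict mode as used by Snakemake: with pipefail, the producer's
#    non-zero status (typically 141 = 128 + SIGPIPE) fails the pipeline.
status_strict=0
bash -c 'set -euo pipefail; yes | head -n 3 > /dev/null' || status_strict=$?

# 3) One possible mitigation: ignore the producer's exit status.
#    Caveat: this also hides genuine producer errors.
status_tolerant=0
bash -c 'set -euo pipefail; { yes || true; } | head -n 3 > /dev/null' || status_tolerant=$?

echo "default=$status_default strict=$status_strict tolerant=$status_tolerant"
```

In this sketch the strict-mode case always fails because `yes` never runs out of input. A real producer only fails when it still has data to write after the consumer has exited, which would be consistent with the rule failing only for some inputs.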

Working on a fix now...

clami66 commented 1 year ago

Closing this for the time being since it looks like #97 fixes it