Open margarett opened 1 month ago
I ran the files one by one, and they all fail at the same point. Here is the log:
Set cluster sensitivity to -s 1.000000
Set cluster mode SET COVER
Set cluster iterations to 1
intermediate_files/clustering/mmseqDB_clu.dbtype exists already!
RuleException:
CalledProcessError in file /home/bruno/bacLIFE/Snakefile, line 158:
Command 'set -euo pipefail; mmseqs cluster intermediate_files/clustering/mmseqDB intermediate_files/clustering/mmseqDB_clu intermediate_files/clustering/mmseqDB_temp --min-seq-id 0.95 --cov-mode 0 -c 0.8' returned non-zero exit status 1.
[Thu Oct 3 23:58:31 2024]
Error in rule clustering:
jobid: 0
input: intermediate_files/combined_proteins/combined_proteins.fasta
output: intermediate_files/clustering/binary_matrix.txt, intermediate_files/clustering/protein_cluster
Exiting because a job execution failed. Look above for error message
WorkflowError:
At least one job did not complete successfully.
[Thu Oct 3 23:58:31 2024]
Error in rule clustering:
jobid: 6
input: intermediate_files/combined_proteins/combined_proteins.fasta
output: intermediate_files/clustering/binary_matrix.txt, intermediate_files/clustering/protein_cluster
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Hi,
Does antismash fail when you run it separately within the antismash_bacLIFE
environment? If so, the problem is with antismash itself. Try reinstalling that conda environment using conda instead of mamba; that helped me fix a similar issue.
The second error is a different issue. Remove the intermediate_files/clustering
folder. Files left over from a previous MMseqs2 run sometimes make the pipeline stop.
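The clean-up suggested above can be sketched as a small shell helper. This is only a sketch, not part of bacLIFE: the function name `reset_clustering` is hypothetical, and it renames the folder instead of deleting it, so the old MMseqs2 files can be restored if removing them does not help.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of bacLIFE): move a stale clustering
# folder out of the way so MMseqs2 can recreate its database files.
reset_clustering() {
    local dir="${1:-intermediate_files/clustering}"
    dir="${dir%/}"   # drop a trailing slash so the backup name is a sibling
    if [ ! -d "$dir" ]; then
        echo "nothing to do: $dir does not exist"
        return 0
    fi
    local backup
    backup="${dir}.bak.$(date +%Y%m%d%H%M%S)"
    mv -- "$dir" "$backup"
    echo "moved $dir to $backup"
}
```

From the bacLIFE directory, `reset_clustering` followed by the usual `snakemake -j 2 --use-conda` would then rerun the clustering rule from scratch.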
Hi @gguerr001, and thank you for your reply. I'm sorry it took so long to test it.
I restarted WSL and redid the whole process, this time creating all environments with conda instead of mamba. (Just a side note: bacLIFE downloads an insane amount of data before one can start testing it...) Unfortunately, I got a similar (but not identical) error:
(bacLIFE_environment) bruno@DESKTOP-ISHAQIB:~/bacLIFE$ Rscript src/rename_genomes.R data/ names_equivalence.txt
[1] TRUE TRUE TRUE TRUE TRUE
(bacLIFE_environment) bruno@DESKTOP-ISHAQIB:~/bacLIFE$ ls
CITATION.cff README.md Snakefile classifier_src data download intermediate_files src
ENVS Shiny_app app_example.zip config.json data_ori images names_equivalence.txt
(bacLIFE_environment) bruno@DESKTOP-ISHAQIB:~/bacLIFE$ snakemake -j 2 --use-conda
Assuming unrestricted shared filesystem usage.
host: DESKTOP-ISHAQIB
(... removed ...)
[Wed Oct 16 21:55:33 2024]
localrule directories:
(... removed, to get to the error and notice how long it's been running ...)
[Wed Oct 16 23:05:24 2024]
localrule antismash:
input: intermediate_files/annot/cbs2016-05_X00001_O/Saureus_cbs2016-05_X00001_O.gbk
output: intermediate_files/antismash/Saureus_cbs2016-05_X00001_O/Saureus_cbs2016-05_X00001_O.gbk
jobid: 26
reason: Missing output files: intermediate_files/antismash/Saureus_cbs2016-05_X00001_O/Saureus_cbs2016-05_X00001_O.gbk; Input files updated by another job: intermediate_files/annot/cbs2016-05_X00001_O/Saureus_cbs2016-05_X00001_O.gbk
wildcards: genus=Saureus, species=cbs2016-05, str=X00001, replicon=O
resources: tmpdir=/tmp
Activating conda environment: antismash_bacLIFE
[Wed Oct 16 23:12:53 2024]
Error in rule antismash:
jobid: 26
input: intermediate_files/annot/cbs2016-05_X00001_O/Saureus_cbs2016-05_X00001_O.gbk
output: intermediate_files/antismash/Saureus_cbs2016-05_X00001_O/Saureus_cbs2016-05_X00001_O.gbk
conda-env: antismash_bacLIFE
shell:
antismash --cb-general --cb-knownclusters --cb-subclusters --output-dir intermediate_files/antismash/Saureus_cbs2016-05_X00001_O/ --asf --pfam2go --genefinding-tool prodigal --smcog-trees intermediate_files/annot/cbs2016-05_X00001_O/Saureus_cbs2016-05_X00001_O.gbk
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
However, there is a difference now. The process hung at this point: it did not drop back to the command line immediately as it did before, the "Shutting down, this might take some time." message did not appear, and the cursor kept blinking. Task Manager showed that WSL was still working hard, so I waited. Unfortunately, after about 10 minutes it did fail:
[Wed Oct 16 23:22:47 2024]
Finished job 22.
12 of 31 steps (39%) done
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2024-10-16T215533.893112.snakemake.log
WorkflowError:
At least one job did not complete successfully.
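Since the output only says "Look above for error message", one way to find the failing rule is to search the complete log named in the output. A minimal sketch, assuming GNU or BSD `grep`; `show_rule_errors` is a hypothetical helper name and the default of 8 context lines is an arbitrary choice.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every "Error in rule" block from a Snakemake
# log, with line numbers and a few lines of trailing context.
show_rule_errors() {
    local log="$1"
    local context="${2:-8}"
    grep -n -A "$context" "Error in rule" "$log"
}
```

For the run above this would be called as `show_rule_errors .snakemake/log/2024-10-16T215533.893112.snakemake.log`.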
"Does antismash fail to run when running separately within the antismash_bacLIFE environment?"
I tried this (I'm not sure whether it is supposed to work outside the script):
(bacLIFE_environment) bruno@DESKTOP-ISHAQIB:~/bacLIFE$ conda deactivate
(base) bruno@DESKTOP-ISHAQIB:~/bacLIFE$ conda activate antismash_bacLIFE
(antismash_bacLIFE) bruno@DESKTOP-ISHAQIB:~/bacLIFE$ antismash --cb-general --cb-knownclusters --cb-subclusters --output-dir intermediate_files/antismash/Saureus_cbs2016-05_X00001_O/ --asf --pfam2go --genefinding-tool prodigal --smcog-trees intermediate_files/annot/cbs2016-05_X00001_O/Saureus_cbs2016-05_X00001_O.gbk
It hung again for several minutes (cursor blinking, nothing happening, but the CPU was working hard). After about 12-14 minutes the prompt came back without errors, so I guess it worked fine?
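A prompt that comes back without an error message does not by itself show that the command succeeded; the exit status does. A minimal sketch, assuming bash; `run_and_report` is a hypothetical helper, not something bacLIFE or antismash provides.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command, then report its exit status and
# wall-clock duration. Status 0 means the command itself reported success.
run_and_report() {
    local start=$SECONDS
    local status=0
    "$@" || status=$?
    echo "exit status: $status (after $((SECONDS - start)) s)"
    return "$status"
}
```

Inside the activated antismash_bacLIFE environment, prefixing the antismash command above with `run_and_report` would print the status once it finishes; anything other than 0 means the manual run failed too.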
I'm separating the comments for clarity (different issues). After the previous success I restarted the process with snakemake.
This time the run was clearly different: it went through the "antismash" job quickly (I could see it briefly) and started printing messages in rapid succession (unlike before, when everything was very slow). Unfortunately, it threw an error again:
Total time = 4.517s
Reported 13924 pairwise alignments, 13924 HSPs.
3699 queries aligned.
RuleException:
CalledProcessError in file /home/bruno/bacLIFE/Snakefile, line 173:
Command 'set -euo pipefail; grep "^>" intermediate_files/clustering/unaligned.fasta > intermediate_files/clustering/unaligned_headers.txt' returned non-zero exit status 1.
[Wed Oct 16 23:38:30 2024]
Error in rule clustering:
jobid: 0
input: intermediate_files/combined_proteins/combined_proteins.fasta
output: intermediate_files/clustering/binary_matrix.txt, intermediate_files/clustering/protein_cluster
Exiting because a job execution failed. Look above for error message
WorkflowError:
At least one job did not complete successfully.
[Wed Oct 16 23:38:30 2024]
Error in rule clustering:
jobid: 14
input: intermediate_files/combined_proteins/combined_proteins.fasta
output: intermediate_files/clustering/binary_matrix.txt, intermediate_files/clustering/protein_cluster
The process hung again for several minutes, with Task Manager still showing intense activity (though not as high as before), and finally, after about 11 minutes:
[Wed Oct 16 23:49:47 2024]
Finished job 23.
5 of 18 steps (28%) done
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2024-10-16T233757.225931.snakemake.log
WorkflowError:
At least one job did not complete successfully.
I did check line 173 of the Snakefile, but I have no idea where to go from there. Restarting the process brings back the error I posted in my second comment, so I deleted the folder as suggested and restarted again, and a different error popped up (now related to the "bigscape" environment...).
edit: ah, it turns out this is issue #13. Well, I can now confirm that deleting the "intermediate_files/clustering" folder and restarting the process is not really a solution, as it throws several other errors. Maybe some individual files can be deleted instead?
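On the failure at Snakefile line 173: `grep` exits with status 1 when no line matches, and because Snakemake runs shell commands with `set -euo pipefail`, that status alone is enough to fail the rule. The most likely reading of the log is that `unaligned.fasta` contained no `>` header lines, but that is an inference, not something confirmed in this thread. Below is a minimal sketch of a wrapper that treats "no match" as success while still failing on real errors; `grep_allow_empty` is a hypothetical name and is not part of bacLIFE's Snakefile.

```shell
#!/usr/bin/env bash
# Hypothetical helper: behave like grep, but treat "no lines matched"
# (exit status 1) as success. Real errors (status 2 or higher, e.g. a
# missing input file) are still propagated to the caller.
grep_allow_empty() {
    local status=0
    grep "$@" || status=$?
    if [ "$status" -gt 1 ]; then
        return "$status"
    fi
    return 0
}
```

In the rule this would read `grep_allow_empty "^>" intermediate_files/clustering/unaligned.fasta > intermediate_files/clustering/unaligned_headers.txt`. Whether the later steps of the clustering rule cope with an empty header list has not been checked here.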
Hi, I get the same error in the antismash rule when running snakemake. Running snakemake within the antismash_bacLIFE environment is not possible, as snakemake is not part of that environment. antismash, however, is available in the antismash_bacLIFE environment. It seems that snakemake is not using the correct environment. Any solutions?
I've seen the previous issue #15 but I can't fix my issue even with --j 1
I am running 5 files for analysis (I don't know what I'm actually analyzing, just helping a friend who has little clue about Linux and command line...) and I always get this error:
antismash is installed in environment antismash_bacLIFE and the folder "intermediate_files/antismash/Saureus_rn6390_X00005_O" is temporarily created (and removed after the error).
I am now trying to run a single file at a time. I also edited the Snakefile to add --debug to the shell parameter.