c5shen / WITCH

WITCH is a multiple sequence alignment method that uses multiple weighted HMMs to align unaligned sequences and find consensuses.
GNU General Public License v3.0

Error finding HMMsearch results when trying to re-run after termination of WITCH due to unexpected server down #12

Open Ufungi opened 1 month ago

Ufungi commented 1 month ago

I tried to re-run WITCH after it was terminated by an unexpected server shutdown, but WITCH fails to retrieve a file. In fact, there is no directory named /data/genome/run/snyoo/programs/WITCH/results/Bacteria/tree_decomp/root/A_0_0. However, /data/genome/run/snyoo/programs/WITCH/results/Bacteria/tree_decomp/backbone/backbone.aln.fasta exists.

    python3 /data/genome/run/snyoo/programs/WITCH/witch.py \
        -q /data/genome/run/snyoo/programs/appspam/Bacteria/Bacteria_query.fasta \
        -b /data/genome/run/snyoo/programs/WITCH/results/ezbiocloud/ezbiocloud_DB_aligned.fasta \
        -e /data/genome/run/snyoo/programs/WITCH/results/ezbiocloud/ezbiocloud_DB_aligned.fasta.raxml.bestTree \
        -d /data/genome/run/snyoo/programs/WITCH/results/Bacteria \
        -o ezbiocloud_DB_query_aligned.fasta \
        -d /data/genome/run/snyoo/programs/WITCH/results/Bacteria \
        -o ezbiocloud_DB_query_aligned.fasta \
        -t 16 \
        --save-weight 1 \
        --keep-decomposition 1

    ***** Configurations **
    home.path: /data/genome/run/snyoo/programs/WITCH/witch_msa/home.path
    main.config: /home/genome/.witch_msa/main.config

    Configs.hmmdir: None
    Configs.input_path: None
    Configs.backbone_path: /data/genome/run/snyoo/programs/WITCH/results/ezbiocloud/ezbiocloud_DB_aligned.fasta
    Configs.backbone_tree_path: /data/genome/run/snyoo/programs/WITCH/results/ezbiocloud/ezbiocloud_DB_aligned.fasta.raxml.bestTree
    Configs.query_path: /data/genome/run/snyoo/programs/appspam/Bacteria/Bacteria_query.fasta
    Configs.outdir: /data/genome/run/snyoo/programs/WITCH/results/Bacteria
    Configs.output_path: /data/genome/run/snyoo/programs/WITCH/results/Bacteria/ezbiocloud_DB_query_aligned.fasta
    Configs.chunksize: 1
    Configs.keeptemp: False
    Configs.keep_decomposition: True
    Configs.mode: witch-ng
    Configs.num_hmms: 10
    Configs.use_weight: True
    Configs.save_weight: True
    Configs.alignment_size: 10
    Configs.alignment_upper_bound: None
    Configs.num_cpus: 16
    Configs.max_concurrent_jobs: 160
    Configs.molecule: None
    Configs.magus_path: /data/genome/run/snyoo/programs/WITCH/witch_msa/tools/magus/magus.py
    Configs.mafftpath: /data/genome/run/snyoo/programs/WITCH/witch_msa/tools/magus/tools/mafft/mafft
    Configs.fasttreepath: /data/genome/run/snyoo/programs/WITCH/witch_msa/tools/magus/tools/fasttree/FastTreeMP
    Configs.mclpath: /data/genome/run/snyoo/programs/WITCH/witch_msa/tools/magus/tools/mcl/bin/mcl
    Configs.hmmsearchpath: /data/genome/run/snyoo/programs/WITCH/witch_msa/tools/magus/tools/hmmer/hmmsearch
    Configs.hmmalignpath: /data/genome/run/snyoo/programs/WITCH/witch_msa/tools/magus/tools/hmmer/hmmalign
    Configs.hmmbuildpath: /data/genome/run/snyoo/programs/WITCH/witch_msa/tools/magus/tools/hmmer/hmmbuild
    Configs.bypass_setup: False
    Configs.log_path: /data/genome/run/snyoo/programs/WITCH/results/Bacteria/log.txt
    Configs.error_path: /data/genome/run/snyoo/programs/WITCH/results/Bacteria/error.txt
    Configs.debug_path: /data/genome/run/snyoo/programs/WITCH/results/Bacteria/debug.txt
    Configs.runtime_path: /data/genome/run/snyoo/programs/WITCH/results/Bacteria/runtime_breakdown.txt
    Configs.keepgcmtemp: False
    Configs.inflation_factor: 4
    Configs.graphclustermethod: mcl
    Configs.graphtracemethod: minclusters
    Configs.graphtraceoptimize: false
    Configs.timeout: 120
    Configs.Backbone: Namespace(alignment_method='magus', alignment_path='', backbone_size='', selection_strategy='median_length', tree_method='FastTree2', tree_path='')
    Configs.MAGUS: Namespace(inflationfactor='', graphclustermethod='', graphtracemethod='', graphtraceoptimize='', maxnumsubsets='', mafftpath='/data/genome/run/snyoo/programs/WITCH/witch_msa/tools/magus/tools/mafft/mafft', mclpath='/data/genome/run/snyoo/programs/WITCH/witch_msa/tools/magus/tools/mcl/bin/mcl', hmmsearchpath='/data/genome/run/snyoo/programs/WITCH/witch_msa/tools/magus/tools/hmmer/hmmsearch', hmmbuildpath='/data/genome/run/snyoo/programs/WITCH/witch_msa/tools/magus/tools/hmmer/hmmbuild', hmmalignpath='/data/genome/run/snyoo/programs/WITCH/witch_msa/tools/magus/tools/hmmer/hmmalign', fasttreepath='/data/genome/run/snyoo/programs/WITCH/witch_msa/tools/magus/tools/fasttree/FastTreeMP')

    Found existing HMM directory: /data/genome/run/snyoo/programs/WITCH/results/Bacteria/tree_decomp/root
    Traceback (most recent call last):
      File "/data/genome/run/snyoo/programs/WITCH/witch.py", line 5, in <module>
        witch_runner()
      File "/data/genome/run/snyoo/programs/WITCH/witch_msa/__init__.py", line 40, in witch_runner
        mainAlignmentProcess(args)
      File "/data/genome/run/snyoo/programs/WITCH/witch_msa/gcmm/gcmm.py", line 160, in mainAlignmentProcess
        = _dummy_search.readHMMDirectory(lock, pool)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/data/genome/run/snyoo/programs/WITCH/witch_msa/gcmm/algorithm.py", line 239, in readHMMDirectory
        tmp_map[0])
    KeyError: 0

---------------------------------------------------------------------------------------------------------------------------------------------------
Below is part of the debug.txt file
---------------------------------------------------------------------------------------------------------------------------------------------------

(witch) genome@limsfep-zen-32c:/data/genome/run/snyoo/programs/WITCH/results/Bacteria$ head debug.txt
2024-05-04 13:43:10     [DEBUG] breaking_edge length = 0.019564, centroid
2024-05-04 13:43:11     [DEBUG] Tree 1 has 24588 nodes, tree 2 has 37240 nodes
2024-05-04 13:43:12     [DEBUG] breaking_edge length = 0.045472, centroid
2024-05-04 13:43:13     [DEBUG] Tree 1 has 12669 nodes, tree 2 has 11919 nodes
2024-05-04 13:43:13     [DEBUG] breaking_edge length = 0.028663, centroid
2024-05-04 13:43:13     [DEBUG] Tree 1 has 7700 nodes, tree 2 has 4969 nodes
2024-05-04 13:43:13     [DEBUG] breaking_edge length = 0.018208, centroid
2024-05-04 13:43:13     [DEBUG] Tree 1 has 3396 nodes, tree 2 has 4304 nodes
2024-05-04 13:43:13     [DEBUG] breaking_edge length = 0.018403, centroid
2024-05-04 13:43:14     [DEBUG] Tree 1 has 1819 nodes, tree 2 has 1577 nodes
(witch) genome@limsfep-zen-32c:/data/genome/run/snyoo/programs/WITCH/results/Bacteria$ tail debug.txt
2024-05-11 05:39:55     [DEBUG] Finished dealing with subset #17166 from /data/genome/run/snyoo/programs/WITCH/results/Bacteria/tree_decomp/root/A_0_17166
2024-05-11 05:39:55     [DEBUG] Finished dealing with subset #17171 from /data/genome/run/snyoo/programs/WITCH/results/Bacteria/tree_decomp/root/A_0_17171
2024-05-11 05:39:55     [DEBUG] Finished dealing with subset #17170 from /data/genome/run/snyoo/programs/WITCH/results/Bacteria/tree_decomp/root/A_0_17170
2024-05-11 05:39:55     [DEBUG] Finished dealing with subset #17168 from /data/genome/run/snyoo/programs/WITCH/results/Bacteria/tree_decomp/root/A_0_17168
2024-05-11 05:39:55     [DEBUG] Finished dealing with subset #17172 from /data/genome/run/snyoo/programs/WITCH/results/Bacteria/tree_decomp/root/A_0_17172
2024-05-11 05:39:55     [DEBUG] Finished dealing with subset #17173 from /data/genome/run/snyoo/programs/WITCH/results/Bacteria/tree_decomp/root/A_0_17173
2024-05-11 05:39:55     [DEBUG] Finished dealing with subset #17169 from /data/genome/run/snyoo/programs/WITCH/results/Bacteria/tree_decomp/root/A_0_17169
2024-05-11 05:39:55     [DEBUG] Finished dealing with subset #17175 from /data/genome/run/snyoo/programs/WITCH/results/Bacteria/tree_decomp/root/A_0_17175
2024-05-11 05:39:55     [DEBUG] Finished dealing with subset #17174 from /data/genome/run/snyoo/programs/WITCH/results/Bacteria/tree_decomp/root/A_0_17174
2024-05-11 05:39:55     [DEBUG] Finished dealing with subset #17176 from /data/genome/run/snyoo/programs/WITCH/results/Bacteria/tree_decomp/root/A_0_17176

Ufungi commented 1 month ago

Could this be the same problem as the previous issue I raised?

c5shen commented 1 month ago

This is not related to the previous issue you raised. The cause is that I have not written a checkpoint system for the HMMBuild/HMMSearch phases (it is on my TODO list). I may add one in the future so that progress can be saved and recovered even after an accidental system shutdown. Currently, if WITCH sees that the tree_decomp folder exists, it assumes that all HMMBuild/HMMSearch jobs are done.
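
To illustrate the kind of checkpoint described above, here is a minimal sketch, not WITCH's actual code. It assumes a hypothetical per-subset marker file (`.done`) written once a subset's jobs finish, and it assumes the subset folders follow the `A_0_<index>` naming seen in debug.txt. The names `DONE_MARKER`, `mark_done`, and `pending_subsets` are made up for this example.

```python
from pathlib import Path

# Hypothetical marker file name; WITCH does not currently write such a file.
DONE_MARKER = ".done"


def mark_done(subset_dir: Path) -> None:
    """Record that all jobs for one subset directory have finished."""
    subset_dir.mkdir(parents=True, exist_ok=True)
    (subset_dir / DONE_MARKER).touch()


def pending_subsets(root: Path, num_subsets: int) -> list[int]:
    """Return the indices of subsets whose folder or marker file is missing."""
    return [
        i
        for i in range(num_subsets)
        if not (root / f"A_0_{i}" / DONE_MARKER).is_file()
    ]
```

With something like this, a restarted run could rebuild only the subsets returned by `pending_subsets` instead of trusting that an existing tree_decomp folder is complete.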

For this issue, I can only recommend that you either: 1) delete the tree_decomp folder and rerun WITCH with the same command, or 2) delete the entire WITCH output folder and start fresh.
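
Option 1 could be scripted roughly as follows. This is only a sketch using the Python standard library; the helper name `remove_tree_decomp` is hypothetical, and it assumes the tree_decomp folder sits directly under the output directory passed with `-d`, as in the paths reported above. It deletes everything under tree_decomp, including the backbone subfolder.

```python
import shutil
from pathlib import Path


def remove_tree_decomp(outdir: str) -> bool:
    """Delete <outdir>/tree_decomp if it exists; return True if it was removed."""
    target = Path(outdir) / "tree_decomp"
    if not target.is_dir():
        # Nothing to clean up; WITCH will create the folder on the next run.
        return False
    shutil.rmtree(target)
    return True
```

For the run in this report, that would be `remove_tree_decomp("/data/genome/run/snyoo/programs/WITCH/results/Bacteria")`, followed by the same witch.py command as before.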