phylo42 / EPIK

Alignment-free phylogenetic placement
MIT License

ipk ran successfully, but epik threw the following error: terminate called after throwing an instance of 'boost::archive::archive_exception' #14

Status: Closed (closed by Ufungi 4 months ago)

Ufungi commented 5 months ago

(epik) genome@limsfep-zen-32c:/data/genome/run/snyoo/ipk$ python ipk.py build -w ./ \
    -r Tricholoma_ref_aln.fasta \
    -t Tricholoma_ref_aln.fasta.raxml.bestTree.rooted \
    -m GTR -a 0.42 -k 10 \
    --no-reduction \
    --threads 32
epik.py place -i ./DB_k10_o1.5.rps -s nucl -o ./ ./Tricholoma_query.fasta --threads 32
/home/genome/anaconda3/envs/epik/bin/ipk-dna --ar-binary /home/genome/anaconda3/envs/epik/bin/raxml-ng --refalign Tricholoma_ref_aln.fasta -t Tricholoma_ref_aln.fasta.raxml.bestTree.rooted -w ./ -k 10 --alpha 0.42 --categories 4 --reduction-ratio 0.99 -o 1.5 --no-filter --both -u 1.0 -j 32 --model GTR --no-reduction
Loading the reference alignment: Tricholoma_ref_aln.fasta
Loaded and filtered 119 sequences.

Loading newick: Tricholoma_ref_aln.fasta.raxml.bestTree.rooted
Loaded a tree of 237 nodes.

Loading newick: Tricholoma_ref_aln.fasta.raxml.bestTree.rooted
Loaded a tree of 237 nodes.

Saving alignment to ./extended_trees/extended_align.fasta...
Saving alignment to ./extended_trees/extended_align.phylip...
Running: /home/genome/anaconda3/envs/epik/bin/raxml-ng --ancestral --msa ./extended_trees/extended_align.phylip --tree ./extended_trees/extended_tree.newick --threads 32 --precision 9 --seed 1 --force msa --redo --model GTR+G4{0.420000}+IU{0}+FC --blopt nr_safe --opt-model on --opt-branches on

RAxML-NG v. 1.2.1 released on 22.12.2023 by The Exelixis Lab.
Developed by: Alexey M. Kozlov and Alexandros Stamatakis.
Contributors: Diego Darriba, Tomas Flouri, Benoit Morel, Sarah Lutteropp, Ben Bettisworth, Julia Haag, Anastasis Togkousidis.
Latest version: https://github.com/amkozlov/raxml-ng
Questions/problems/suggestions? Please visit: https://groups.google.com/forum/#!forum/raxml

System: AMD Ryzen Threadripper PRO 5975WX 32-Cores, 32 cores, 503 GB RAM

RAxML-NG was called at 11-Apr-2024 09:25:07 as follows:

/home/genome/anaconda3/envs/epik/bin/raxml-ng --ancestral --msa ./extended_trees/extended_align.phylip --tree ./extended_trees/extended_tree.newick --threads 32 --precision 9 --seed 1 --force msa --redo --model GTR+G4{0.420000}+IU{0}+FC --blopt nr_safe --opt-model on --opt-branches on

Analysis options:
  run mode: Ancestral state reconstruction
  start tree(s): user
  random seed: 1
  tip-inner: ON
  pattern compression: OFF
  per-rate scalers: OFF
  site repeats: OFF
  logLH epsilon: general: 10.000000, brlen-triplet: 1000.000000
  branch lengths: proportional (ML estimate, algorithm: NR-SAFE)
  SIMD kernels: AVX2
  parallelization: coarse-grained (auto), PTHREADS (32 threads), thread pinning: OFF

WARNING: Running in REDO mode: existing checkpoints are ignored, and all result files will be overwritten!

WARNING: Running in FORCE mode: some safety checks are disabled!

[00:00:00] Reading alignment from file: ./extended_trees/extended_align.phylip
[00:00:00] Loaded alignment with 591 taxa and 1240 sites

Alignment comprises 1 partitions and 1240 sites

Partition 0: noname
Model: GTR+FC+IU{0}+G4m{0.42}
Alignment sites: 1240
Gaps: 89.99 %
Invariant sites: 38.87 %

NOTE: Binary MSA file created: ./extended_trees/extended_align.phylip.raxml.rba

Parallelization scheme autoconfig: 1 worker(s) x 32 thread(s)

[00:00:00] Loading user starting tree(s) from: ./extended_trees/extended_tree.newick
Parallel reduction/worker buffer size: 1 KB / 0 KB

[00:00:00] Data distribution: max. partitions/sites/weight per thread: 1 / 39 / 624
[00:00:00] Data distribution: max. searches per worker: 1

Starting ML tree search with 1 distinct starting trees

[00:00:00] Tree #1, initial LogLikelihood: -19390.943248037

[00:00:00 -19390.943248037] Initial branch length optimization
[00:00:00 -19360.310990217] Model parameter optimization (eps = 10.000000000)

[00:00:00] Tree #1, final logLikelihood: -18684.574885272

Optimized model parameters:

Partition 0: noname
Rate heterogeneity: GAMMA (4 cats, mean), alpha: 0.420000000 (user), weights&rates: (0.250000000,0.019702310) (0.250000000,0.196531673) (0.250000000,0.752046357) (0.250000000,3.031719660)
P-inv (user): 0.000000000
Base frequencies (empirical): 0.237376835 0.208498576 0.220920724 0.333203865
Substitution rates (ML): 1.712232119 4.010666061 1.712099352 0.639637079 5.041377406 1.000000000

Marginal ancestral probabilities saved to: /data/genome/run/snyoo/ipk/extended_trees/extended_align.phylip.raxml.ancestralProbs
Reconstructed ancestral sequences saved to: /data/genome/run/snyoo/ipk/extended_trees/extended_align.phylip.raxml.ancestralStates
Node-labeled tree saved to: /data/genome/run/snyoo/ipk/extended_trees/extended_align.phylip.raxml.ancestralTree

Execution log saved to: /data/genome/run/snyoo/ipk/extended_trees/extended_align.phylip.raxml.log

Analysis started: 11-Apr-2024 09:25:07 / finished: 11-Apr-2024 09:25:10

Elapsed time: 2.574 seconds

Ancestral reconstruction results have been found:
./extended_trees/extended_align.phylip.raxml.ancestralProbs
./extended_trees/extended_align.phylip.raxml.ancestralTree
Loading RAXML-NG results: ./extended_trees/extended_align.phylip.raxml.ancestralProbs...
Loaded 561 matrices of 1240 rows.
Time (ms): 206

Loading newick: ./extended_trees/extended_align.phylip.raxml.ancestralTree
Loaded a tree of 1180 nodes.

Saving tree to ./AR/ar_tree_rerooted.newick...
Construction parameters:
Sequence type: DNA
k: 10
omega: 1.5
Keep positions: false

Building database: [stage 1 / 2]:
[============================================================] 235/236
Calculated 90854647 phylo-k-mers.
Calculation time: 9993

Building database: [stage 2 / 2]:
Kept 1002195 / 1002195 k-mers (100%) | 63171645 / 63171645 entries (100%).
Filtering time: 6044

Building database: Done.
Built 63171645 phylo-k-mers for 1002195 different k-mers.
Total time (ms): 16037

Saving database to: ./DB_k10_o1.5.rps...
Compression: ON
Boost version: 1.84.0
Time (ms): 10528

(epik) genome@limsfep-zen-32c:/data/genome/run/snyoo/ipk$ cp DB_k10_o1.5.rps ../epik
(epik) genome@limsfep-zen-32c:/data/genome/run/snyoo/ipk$ cd ../epik
(epik) genome@limsfep-zen-32c:/data/genome/run/snyoo/epik$ epik.py place -i ./DB_k10_o1.5.rps -s nucl -o ./ ./Tricholoma_query.fasta --threads 32
/home/genome/anaconda3/envs/epik/bin/epik-dna -d ./DB_k10_o1.5.rps -q ./Tricholoma_query.fasta -j 32 --omega 1.5 --mu 1.0 -o ./ ./Tricholoma_query.fasta
Loading database with mu=1 and omega=1.5...
Boost version: 1.84.0
terminate called after throwing an instance of 'boost::archive::archive_exception'
 what():  class version N3i2l11_pkdb_valueILb0EEE

nromashchenko commented 5 months ago

Hello! Thanks for reporting this. Could you specify the operating system you use and the way IPK and EPIK were compiled?

blinard-BIOINFO commented 5 months ago

@nromashchenko Exact same error as in #12

terminate called after throwing an instance of 'boost::archive::archive_exception'
 what():  class version N3i2l11_pkdb_valueILb0EEE
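
For reference, the string after `class version` looks like a mangled C++ type name (Itanium ABI style, as emitted by GCC and Clang). The sketch below decodes it by hand under that assumption. `demangle_nested` is a hypothetical helper written for this note; it handles only the small subset needed here (nested names and boolean template arguments) and is not a general demangler.

```python
def demangle_nested(name: str) -> str:
    """Demangle a tiny subset of Itanium-ABI nested names of the form N...E.

    Supported pieces: <length><identifier> components and template argument
    lists that contain only boolean literals (Lb0E / Lb1E).
    """
    if not name.startswith("N"):
        raise ValueError("only nested names (N...E) are handled")
    pos = 1
    parts = []
    while name[pos] != "E":
        if name[pos].isdigit():
            # <length><identifier>, e.g. "3i2l" -> "i2l"
            start = pos
            while name[pos].isdigit():
                pos += 1
            length = int(name[start:pos])
            parts.append(name[pos:pos + length])
            pos += length
        elif name[pos] == "I" and parts:
            # Template argument list, terminated by "E"
            pos += 1
            args = []
            while name[pos] != "E":
                if name.startswith("Lb", pos) and name[pos + 3] == "E":
                    args.append("true" if name[pos + 2] == "1" else "false")
                    pos += 4
                else:
                    raise ValueError("unsupported template argument")
            pos += 1  # skip the "E" that closes the argument list
            parts[-1] += "<" + ", ".join(args) + ">"
        else:
            raise ValueError(f"unsupported construct at offset {pos}")
    return "::".join(parts)


if __name__ == "__main__":
    print(demangle_nested("N3i2l11_pkdb_valueILb0EEE"))  # i2l::_pkdb_value<false>
```

So the exception names the type `i2l::_pkdb_value<false>`, which is the class whose serialized version the loader rejected.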

I know this exception is expected when different Boost versions are used to serialize and deserialize. That is not the case here, though: the same Boost version is used to compile IPK and EPIK, and the same holds for the conda build.

Could it be related to different versions of some other library used on the IPK side versus the EPIK side?

nromashchenko commented 5 months ago

This must indeed be the same as #12. The issue is on the IPK side, not EPIK (issue). We fixed it. Once CI and some tests are done, we'll release a new version of IPK.
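
As background on how this can happen even with identical Boost versions on both sides: Boost.Serialization also stores a version number per serialized class, and the `class version` message appears to correspond to its `unsupported_class_version` error, raised when a record carries a class version newer than the one the reading program was built with. The following is a toy Python model of that check, not Boost's or IPK's actual code; every name in it is hypothetical.

```python
import io
import struct


class UnsupportedClassVersion(Exception):
    """Toy stand-in for a 'class version' archive error."""


# Hypothetical: the newest class version the reader was compiled to understand.
READER_CLASS_VERSION = 1


def save(stream, value: int, class_version: int) -> None:
    """Write a record as two little-endian uint32: class version, then payload."""
    stream.write(struct.pack("<II", class_version, value))


def load(stream, supported_version: int = READER_CLASS_VERSION) -> int:
    """Read a record, rejecting class versions newer than the reader supports."""
    class_version, value = struct.unpack("<II", stream.read(8))
    if class_version > supported_version:
        raise UnsupportedClassVersion(
            f"class version {class_version} > supported {supported_version}"
        )
    return value


if __name__ == "__main__":
    ok = io.BytesIO()
    save(ok, 42, class_version=1)
    ok.seek(0)
    print(load(ok))

    newer = io.BytesIO()
    save(newer, 42, class_version=2)  # the writer stamps a newer class version
    newer.seek(0)
    try:
        load(newer)
    except UnsupportedClassVersion as exc:
        print("load failed:", exc)
```

In this model the library version never enters the check, which is consistent with the fix landing on the writer (IPK) side and requiring the database to be rebuilt.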

Ufungi commented 4 months ago

I hope it gets resolved soon!

nromashchenko commented 4 months ago

You should rebuild the database with the new release 0.5.1 of IPK and try again.
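
A minimal sketch of the rebuild-and-retry sequence, reusing the exact commands from the original report. It assumes IPK 0.5.1 is already installed in the active environment and that both steps run in the same working directory (the report copied the database to a separate `../epik` directory instead). `build_commands` and `run_all` are hypothetical helpers, and the script only prints the commands unless `dry_run=False` is passed.

```python
import shlex
import subprocess


def build_commands(
    workdir="./",
    ref_aln="Tricholoma_ref_aln.fasta",
    tree="Tricholoma_ref_aln.fasta.raxml.bestTree.rooted",
    query="./Tricholoma_query.fasta",
    database="./DB_k10_o1.5.rps",  # file name taken from the IPK log in the report
    threads=32,
):
    """Return the IPK build and EPIK place commands as argv lists."""
    ipk = [
        "python", "ipk.py", "build", "-w", workdir,
        "-r", ref_aln,
        "-t", tree,
        "-m", "GTR", "-a", "0.42", "-k", "10",
        "--no-reduction",
        "--threads", str(threads),
    ]
    epik = [
        "epik.py", "place", "-i", database, "-s", "nucl",
        "-o", workdir, query,
        "--threads", str(threads),
    ]
    return ipk, epik


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops at the first failing step
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(build_commands())
```

The old `DB_k10_o1.5.rps` should not be reused: the database has to be regenerated by the new IPK release before running `epik.py place` again.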