rrwick / Trycycler

A tool for generating consensus long-read assemblies for bacterial genomes
GNU General Public License v3.0

segmentation fault error #48

Closed steinbrl closed 1 year ago

steinbrl commented 1 year ago

Hi,

I built Trycycler into a pipeline. With some datasets it works flawlessly, but with others it reproducibly produces the following error:

Building distance matrix (2022-11-14 12:22:29)
Mash is used to build a distance matrix of all contigs in the assemblies.

A_contig_1: 0.000 0.002
B_Utg38:    0.002 0.000

Clustering (2022-11-14 12:22:29)
The contigs are now split into clusters using a complete-linkage hierarchical approach.

trycycler/cluster_001/1_contigs:
  trycycler/cluster_001/1_contigs/A_contig_1.fasta: 127,851 bp, 39.9x
  trycycler/cluster_001/1_contigs/B_Utg38.fasta: 126,371 bp, 41.8x

Building FastME tree (2022-11-14 12:22:29)
R (ape and phangorn) are used to build a FastME tree of the relationships between the contigs.

saving distance matrix: trycycler/contigs.phylip
saving tree: trycycler/contigs.newick

caught segfault
address 0x38, cause 'memory not mapped'

Traceback:
 1: fastme.bal(distances)
An irrecoverable exception occurred. R is aborting now ...
Traceback (most recent call last):
  File "/home/leinehome/mh-hannover.local/steinbrl/.conda/envs/Trycycler/bin/trycycler", line 10, in <module>
    sys.exit(main())
  File "/home/leinehome/mh-hannover.local/steinbrl/.conda/envs/Trycycler/lib/python3.10/site-packages/trycycler/main.py", line 41, in main
    cluster(args)
  File "/home/leinehome/mh-hannover.local/steinbrl/.conda/envs/Trycycler/lib/python3.10/site-packages/trycycler/cluster.py", line 43, in cluster
    build_tree(seq_names, seqs, depths, matrix, args.out_dir, cluster_numbers)
  File "/home/leinehome/mh-hannover.local/steinbrl/.conda/envs/Trycycler/lib/python3.10/site-packages/trycycler/cluster.py", line 264, in build_tree
    subprocess.check_output(['Rscript', tree_script])
  File "/home/leinehome/mh-hannover.local/steinbrl/.conda/envs/Trycycler/lib/python3.10/subprocess.py", line 420, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/home/leinehome/mh-hannover.local/steinbrl/.conda/envs/Trycycler/lib/python3.10/subprocess.py", line 524, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['Rscript', '/tmp/tmpcahstqdr/tree.R']' died with <Signals.SIGSEGV: 11>.

Do you have any idea what is triggering this? The script ran on an HPC node with 64 cores, 128 GB RAM and CentOS, managed by SLURM.

Best wishes,

Lars

rrwick commented 1 year ago

It seems as though the crash is happening in R (in the fastme.bal function), not in Trycycler's code. So I'm afraid I can't really debug this one.
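The "died with <Signals.SIGSEGV: 11>" part of the traceback is how Python's subprocess module reports a child process that was killed by a signal, which is consistent with the crash originating inside the Rscript call. As a rough sketch of how to check this outside Trycycler (the helper name `describe_exit` and the self-crashing stand-in command are illustrative assumptions, not part of Trycycler), one could run a command and inspect its return code:

```python
import signal
import subprocess
import sys


def describe_exit(cmd):
    """Run cmd and report how it ended: 'ok', 'exit N', or a signal name.

    Hypothetical helper for illustration; not part of Trycycler.
    """
    result = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    if result.returncode == 0:
        return "ok"
    if result.returncode < 0:
        # On POSIX, a negative return code means the child was killed
        # by the signal with that number (e.g. -11 for SIGSEGV).
        return signal.Signals(-result.returncode).name
    return f"exit {result.returncode}"


if __name__ == "__main__":
    # Stand-in child that sends itself SIGSEGV, playing the role of the
    # crashing Rscript call. Replace with the real command to test it.
    crash = [sys.executable, "-c",
             "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"]
    print(describe_exit(crash))
```

If the same R code run on its own (for example, `Rscript` with a script that calls `fastme.bal` on the saved distance matrix) also ends with a signal name rather than a normal exit code, that points at the R installation or its packages rather than at the Python side.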

You could try:

microbemarsh commented 1 year ago

Hey @steinbrl, I'm having the same issue with Trycycler v0.5.4. Did you ever resolve it? I'm using a very similar setup (SLURM HPC node) and I'm getting the same error code.