veg / hyphy

HyPhy: Hypothesis testing using Phylogenies
http://www.hyphy.org

Cannot Parallelize - Segmentation fault with parallel or openMPI #1562

Closed NicMAlexandre closed 1 year ago

NicMAlexandre commented 1 year ago

Hello, I have a few hundred commands that I am trying to run in parallel from the command line, each of the form:

(echo 1; echo 5; echo 1; echo OG0022344.T2P2T.fasta; echo OG0022344.S.nwk; echo 5; echo 1; echo 3; echo 3; echo 3; echo 250; echo 1; echo OG0022344.S.json; echo OG0022344.S.params; echo 1; echo /dev/null) | hyphy -i > OG0022344.S.out

When I run a command directly on the login node, it completes without problems and produces the expected output. When I run the commands with parallel or a helper script, I get the following error after the first command runs successfully:

/usr/bin/bash: line 1: 2097 Done ( echo 1; echo 5; echo 1; echo OG0020777.T2P2T.fasta; echo OG0020777.B.nwk; echo 5; echo 1; echo 3; echo 3; echo 3; echo 250; echo 1; echo OG0020777.B.json; echo OG0020777.B.params; echo 1; echo /dev/null ) 2098 Segmentation fault | hyphy -i > OG0020777.B.out

NicMAlexandre commented 1 year ago

I should note that this error only happens in a subset of runs, maybe one in every 2 or 3. When I run the same commands on their own on the login node, they run fine.
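Since only some runs crash, it can help to record the exit status of each run. The following is a minimal sketch, not part of HyPhy: `run_and_classify` is a hypothetical helper name, and it relies only on the shell convention that a child killed by SIGSEGV (signal 11) is reported with exit status 128 + 11 = 139.

```shell
# Hypothetical helper: run a command, send its stdout to a log file,
# and print a one-word classification of how it ended.
# Usage: run_and_classify LOGFILE COMMAND [ARGS...]
run_and_classify() {
    log=$1
    shift
    "$@" > "$log"
    rc=$?
    if [ "$rc" -eq 0 ]; then
        echo "ok"
    elif [ "$rc" -eq 139 ]; then
        # 139 = 128 + 11: the child was killed by SIGSEGV
        echo "segfault"
    else
        echo "failed($rc)"
    fi
}
```

For example, `run_and_classify OG0021597.B.out hyphy busted --alignment OG0021597.T2P2T.fasta --tree OG0021597.B.nwk --branches Foreground` would print `segfault` for a crashed run, which makes it easy to collect the failing inputs for a bug report.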

NicMAlexandre commented 1 year ago

I have tried running this a different way like this:

hyphy busted --alignment OG0021597.T2P2T.fasta --tree OG0021597.B.nwk --branches Foreground

I am still getting a segmentation fault:

Improving branch lengths, nucleotide substitution biases, and global dN/dS ratios under a full codon model
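The non-interactive form above is easier to batch than the piped `echo` answers. As a rough sketch, the function below prints one command line per alignment; the name `busted_jobs`, the `.out`/`.err` log names, and the assumption that every `<id>.T2P2T.fasta` has a matching `<id>.B.nwk` in the current directory are illustrative, not taken from HyPhy.

```shell
# Hypothetical job generator: emit one hyphy command line per alignment file.
# Assumes each <id>.T2P2T.fasta is paired with <id>.B.nwk in the current directory.
busted_jobs() {
    for aln in *.T2P2T.fasta; do
        [ -e "$aln" ] || continue          # no alignments matched the glob
        id=${aln%.T2P2T.fasta}
        [ -e "$id.B.nwk" ] || continue     # skip alignments without a tree
        echo "hyphy busted --alignment $aln --tree $id.B.nwk --branches Foreground > $id.B.out 2> $id.B.err"
    done
}
```

The printed lines can then be handed to a job runner, for instance `busted_jobs | xargs -P 4 -I{} sh -c '{}'`. Keeping a separate stdout and stderr file per job means a crash in one run does not get interleaved with the output of the others.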

spond commented 1 year ago

Dear @nicolasalexandre21,

Segmentation faults indicate an internal HyPhy bug. Would you be able to send me the input files (OG0021597.T2P2T.fasta and OG0021597.B.nwk) so that I can check whether the latest version of HyPhy still has this issue?

Which version are you running? ($ hyphy --version)

Best, Sergei

NicMAlexandre commented 1 year ago

Hi Sergei,

Sorry for the late response! The version is "HYPHY 2.5.46(MP) for Linux on x86_64".

Attachments: OG0017040.B.nwk.txt, OG0017040.nal.fasta.txt

Attached are an example alignment file (OG0017040.nal.fasta) and newick file (OG0017040.B.nwk) that produce this error.
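When the same check has to be repeated on several machines or conda environments, the version banner can be reduced to a bare number and compared in a script. This is a sketch only: both function names are hypothetical, the banner format is assumed to match the string quoted above, and `sort -V` is a GNU coreutils option.

```shell
# Extract the dotted version number from a banner such as
# "HYPHY 2.5.46(MP) for Linux on x86_64" (format assumed from the quote above).
hyphy_version_number() {
    printf '%s\n' "$1" | sed -n 's/^HYPHY \([0-9][0-9.]*\).*/\1/p'
}

# Succeed if version $1 is strictly older than version $2 (needs GNU sort -V).
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}
```

A typical use would be `v=$(hyphy_version_number "$(hyphy --version)")` followed by `version_lt "$v" MINIMUM && echo "please update"`, where MINIMUM is whatever release you want to require.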

spond commented 1 year ago

Dear @nicolasalexandre21,

I am not seeing errors with the current version (2.5.47) on either Linux or Mac OS X. Can you please update to 2.5.47 and check whether the error persists?

If so, would you mind running the following commands (assuming you have the standard toolchain installed)?

$ gdb hyphy
(gdb) run busted --alignment OG0017040.nal.fasta.txt --tree OG0017040.B.nwk.txt --branches Foreground
...
(assuming the program crashes)
(gdb) bt

and then reporting what the debugger prints out?

Best, Sergei
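On a cluster node where an interactive debugger session is awkward, the same backtrace can be captured in one shot with gdb's batch mode. The wrapper below is a sketch under stated assumptions: `batch_backtrace` is a hypothetical name, and the `GDB` variable exists only so that a different debugger binary (or a stub) can be substituted. The flags `-batch`, `-ex`, and `--args` are standard gdb options.

```shell
# Hypothetical wrapper: run a program under gdb non-interactively and
# print a backtrace once the program stops (for example on a segfault).
# Usage: batch_backtrace PROGRAM [ARGS...]
batch_backtrace() {
    "${GDB:-gdb}" -batch -ex run -ex bt --args "$@"
}
```

For example, `batch_backtrace hyphy busted --alignment OG0017040.nal.fasta.txt --tree OG0017040.B.nwk.txt --branches Foreground > gdb.log 2>&1` would leave the debugger output in `gdb.log`, ready to paste into the issue.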

NicMAlexandre commented 1 year ago

Hi Sergei,

This has worked perfectly, thank you for the tip. FYI, I was using the conda installation previously.

Nicolas
