davidemms / OrthoFinder

Phylogenetic orthology inference for comparative genomics
https://davidemms.github.io/
GNU General Public License v3.0

MCL error causing downstream core dump #625

Closed dborgesr closed 3 years ago

dborgesr commented 3 years ago

Hello,

I am trying to run OrthoFinder on 109 proteomes (filtered to the longest isoform) with the following command: orthofinder -a 1 -t 45 -S mmseqs -b ./OrthoFinder/Results_Sep24_3/

EDIT: I forgot to mention that I am running this on an m5a.16xlarge with 256 GB of RAM. I assumed that would be enough, especially running it with 1 core.

I am running into the following error:

WARNING: program called by OrthoFinder produced output to stderr

Command: mcl /big_data/orthofinder_input/OrthoFinder/Results_Sep24_3/../Results_Sep26_7/WorkingDirectory/OrthoFinder_graph.txt -I 1.5 -o /big_data/orthofinder_input/OrthoFinder/Results_Sep24_3/../Results_Sep26_7/WorkingDirectory/clusters_OrthoFinder_I1.5.txt -te 1 -V all

stdout

b''

stderr

b'[mcl] cut <8> instances of overlap\n'

2021-09-28 10:04:54 : Ran MCL

Writing orthogroups to file

OrthoFinder assigned 2501200 genes (96.1% of total) to 94127 orthogroups. Fifty percent of all genes were in orthogroups with 160 or more genes (G50 was 160) and were contained in the largest 2749 orthogroups (O50 was 2749). There were 2 orthogroups with all species present and 0 of these consisted entirely of single-copy genes.

2021-09-28 10:13:11 : Done orthogroups

Analysing Orthogroups

Calculating gene distances

Bus error (core dumped)

I've tried resuming from the post-BLAST step but still arrive at the same error. I'm a bit at a loss as to what b'[mcl] cut <8> instances of overlap\n' means, but it seems important?

--Thank you very much

dborgesr commented 3 years ago

Commenting to say that I believe I traced the problem to the BLAST run having run out of disk space, which then caused the downstream core dump. I reran with a bigger disk and everything went swimmingly.

Closing!
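For anyone landing here with the same symptom, a minimal sketch of the kind of check that would have caught this earlier: confirm the disk holding the working directory still has free space, and look for zero-byte files left behind by an interrupted search step. The function names, the 50 GB threshold, and the directory path are hypothetical and not part of OrthoFinder. Note that a file truncated partway through will not show up as empty, so this only catches the most obvious case.

```python
import os
import shutil


def free_space_gb(path):
    """Return the free space, in GiB, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free / 1024**3


def has_free_space(path, min_free_gb):
    """True if at least `min_free_gb` GiB are free on the filesystem of `path`."""
    return free_space_gb(path) >= min_free_gb


def find_empty_files(directory):
    """Return a sorted list of zero-byte regular files found under `directory`."""
    empty = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            full_path = os.path.join(root, name)
            if os.path.isfile(full_path) and os.path.getsize(full_path) == 0:
                empty.append(full_path)
    return sorted(empty)


# Hypothetical usage: point this at your own WorkingDirectory.
work_dir = "."
if not has_free_space(work_dir, min_free_gb=50):  # threshold is an assumption
    print(f"Low disk space: {free_space_gb(work_dir):.1f} GiB free")
```

Running find_empty_files on the WorkingDirectory before resuming should at least flag search output that was never written.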

VectorFrankenstein commented 2 years ago

Hi @dborgesr ,

By any chance do you remember if this would produce some partial output along the lines of the following files:

Comparative_Genomics_Statistics Gene_Trees Log.txt Orthogroups Orthogroup_Sequences Orthologues Single_Copy_Orthologue_Sequences WorkingDirectory

In my case the Orthologues directory is empty, and the other folders are missing. I am doing this on a remote machine that is not logging the crashed shell properly, so I am unable to read the output, but I did get the same MCL error as you and think I have a similar issue.

I am running 110 proteomes as well. Any information that you can pass along will be helpful.
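To make the comparison concrete, here is a rough sketch of how I would summarize which of those outputs exist. The list of names is taken from the listing above; the function name and the results path are hypothetical, and this only reports presence and emptiness, not whether the contents are complete.

```python
import os

# Output names as listed above; whether every run produces all of them is an assumption.
EXPECTED_OUTPUTS = [
    "Comparative_Genomics_Statistics",
    "Gene_Trees",
    "Log.txt",
    "Orthogroups",
    "Orthogroup_Sequences",
    "Orthologues",
    "Single_Copy_Orthologue_Sequences",
    "WorkingDirectory",
]


def summarize_results(results_dir):
    """Map each expected output name to 'missing', 'empty', or 'ok'."""
    status = {}
    for name in EXPECTED_OUTPUTS:
        path = os.path.join(results_dir, name)
        if not os.path.exists(path):
            status[name] = "missing"
        elif os.path.isdir(path):
            # A directory counts as empty if it has no entries at all.
            status[name] = "ok" if os.listdir(path) else "empty"
        else:
            # A regular file counts as empty if it is zero bytes long.
            status[name] = "ok" if os.path.getsize(path) > 0 else "empty"
    return status


# Hypothetical usage: replace "." with the path to your Results_* directory.
for name, state in summarize_results(".").items():
    print(f"{name}: {state}")
```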

Sincerely, Rijan