Closed GroovyLooper closed 3 years ago
Hi Max
The problem is linked to the initial warning messages in your output. It looks like there were not enough homologous sequences in your input files for OrthoFinder to run the analysis. Did you include all of the protein sequences for each species in your input files?
All the best David
Hi David, First, thank you for the quick reply. I did not include the complete proteome for any of the species, as I am looking at one particular protein. I BLASTed that protein, separated the BLAST results by species, and am now attempting to run them through OrthoFinder. There are 46 sequences across 8 species (with only one species having only one gene of interest), which I thought should be enough for OrthoFinder to recognize. Additionally, OrthoFinder seems to work when I remove some of the species and genes of interest (leaving 7 species and 25 genes), although I still get the warning telling me that there are too few hits between species x and species y.
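For anyone checking a similar setup, here is a minimal sketch for counting how many sequences each per-species input file contains before running OrthoFinder. The function names and the list of FASTA suffixes are my own assumptions for illustration and are not part of OrthoFinder.

```python
from pathlib import Path

# Assumed set of FASTA file extensions; adjust to match your own files.
FASTA_SUFFIXES = {".fa", ".faa", ".fasta", ".fas", ".pep"}


def count_sequences(fasta_text):
    """Count FASTA records by counting header lines that start with '>'."""
    return sum(1 for line in fasta_text.splitlines() if line.startswith(">"))


def sequences_per_species(input_dir):
    """Map each FASTA file stem in input_dir to its number of sequences."""
    counts = {}
    for path in sorted(Path(input_dir).iterdir()):
        if path.is_file() and path.suffix.lower() in FASTA_SUFFIXES:
            counts[path.stem] = count_sequences(path.read_text())
    return counts


# Example usage (hypothetical directory name taken from the command in this thread):
# for species, n in sequences_per_species("species").items():
#     print(f"{species}\t{n}")
```

Files with only a handful of sequences are the ones most likely to show up in the "Too few hits" warnings.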
With 7 species and 25 genes (most of these sequences being the same as those run in the first post above), this is my output:
`OrthoFinder version 2.4.0 Copyright (C) 2014 David Emms
2020-11-09 12:32:29 : Starting OrthoFinder
40 thread(s) for highly parallel tasks (BLAST searches etc.)
1 thread(s) for OrthoFinder algorithm
Test can run "mcl -h" - ok
Test can run "fastme -i /home/immunoviromics/antiviral_shared/Max/BLASTloopMDA5/MDA5Output_e-20_c75/species/OrthoFinder/Results_Nov09_5/WorkingDirectory/SimpleTest.phy -o /home/immunoviromics/antiviral_shared/Max/BLASTloopMDA5/MDA5Output_e-20_c75/species/OrthoFinder/Results_Nov09_5/WorkingDirectory/SimpleTest.tre" - ok
2020-11-09 12:32:30 : Creating diamond database 1 of 7
2020-11-09 12:32:30 : Creating diamond database 2 of 7
2020-11-09 12:32:30 : Creating diamond database 3 of 7
2020-11-09 12:32:30 : Creating diamond database 4 of 7
2020-11-09 12:32:30 : Creating diamond database 5 of 7
2020-11-09 12:32:30 : Creating diamond database 6 of 7
2020-11-09 12:32:30 : Creating diamond database 7 of 7
Using 40 thread(s)
2020-11-09 12:32:30 : This may take some time....
2020-11-09 12:32:30 : Done 0 of 49
2020-11-09 12:32:30 : Done 10 of 49
2020-11-09 12:32:31 : Done all-versus-all sequence search
2020-11-09 12:32:31 : Initial processing of each species
2020-11-09 12:32:31 : Initial processing of species 0 complete
2020-11-09 12:32:31 : Initial processing of species 1 complete
WARNING: Too few hits between species 2 and species 2 to normalise the scores, these hits will be ignored
WARNING: Too few hits between species 2 and species 4 to normalise the scores, these hits will be ignored
WARNING: Too few hits between species 2 and species 5 to normalise the scores, these hits will be ignored
WARNING: Too few hits between species 2 and species 6 to normalise the scores, these hits will be ignored
2020-11-09 12:32:31 : Initial processing of species 2 complete
2020-11-09 12:32:31 : Initial processing of species 3 complete
WARNING: Too few hits between species 4 and species 2 to normalise the scores, these hits will be ignored
WARNING: Too few hits between species 4 and species 4 to normalise the scores, these hits will be ignored
WARNING: Too few hits between species 4 and species 5 to normalise the scores, these hits will be ignored
WARNING: Too few hits between species 4 and species 6 to normalise the scores, these hits will be ignored
2020-11-09 12:32:31 : Initial processing of species 4 complete
WARNING: Too few hits between species 5 and species 2 to normalise the scores, these hits will be ignored
WARNING: Too few hits between species 5 and species 4 to normalise the scores, these hits will be ignored
WARNING: Too few hits between species 5 and species 5 to normalise the scores, these hits will be ignored
WARNING: Too few hits between species 5 and species 6 to normalise the scores, these hits will be ignored
2020-11-09 12:32:31 : Initial processing of species 5 complete
WARNING: Too few hits between species 6 and species 2 to normalise the scores, these hits will be ignored
WARNING: Too few hits between species 6 and species 4 to normalise the scores, these hits will be ignored
WARNING: Too few hits between species 6 and species 5 to normalise the scores, these hits will be ignored
WARNING: Too few hits between species 6 and species 6 to normalise the scores, these hits will be ignored
2020-11-09 12:32:31 : Initial processing of species 6 complete
2020-11-09 12:32:34 : Connected putative homologues
2020-11-09 12:32:34 : Written final scores for species 0 to graph file
2020-11-09 12:32:34 : Written final scores for species 1 to graph file
2020-11-09 12:32:34 : Written final scores for species 2 to graph file
2020-11-09 12:32:34 : Written final scores for species 3 to graph file
2020-11-09 12:32:34 : Written final scores for species 4 to graph file
2020-11-09 12:32:34 : Written final scores for species 5 to graph file
2020-11-09 12:32:34 : Written final scores for species 6 to graph file
2020-11-09 12:32:34 : Ran MCL
OrthoFinder assigned 25 genes (100.0% of total) to 4 orthogroups. Fifty percent of all genes were in orthogroups with 5 or more genes (G50 was 5) and were contained in the largest 2 orthogroups (O50 was 2). There were 1 orthogroups with all species present and 0 of these consisted entirely of single-copy genes.
2020-11-09 12:32:34 : Done orthogroups
Exception RuntimeError: RuntimeError('cannot join current thread',) in <Finalize object, dead> ignored
2020-11-09 12:32:36 : Done
Using fallback species tree inference method
2020-11-09 12:32:37 : Starting STRIDE
2020-11-09 12:32:37 : Done STRIDE
Observed 0 well-supported, non-terminal duplications. 0 support the best roots and 0 contradict them.
Best outgroups for species tree:
  Mnemiopsis_leidyi
  Amphimedon_queenslandica, Mnemiopsis_leidyi, Hofstenia_miamia
  Amphimedon_queenslandica
  Amphimedon_queenslandica, Hofstenia_miamia
  Exaiptasia_Pallida
  Trichoplax_adhaerens
  Aurelia_aurita, Exaiptasia_Pallida, Calvadosia_cruxmelitensis
  Hofstenia_miamia
  Calvadosia_cruxmelitensis
  Aurelia_aurita
  Aurelia_aurita, Calvadosia_cruxmelitensis
WARNING: Multiple potential species tree roots were identified, only one will be analyed.
Outgroup: Mnemiopsis_leidyi
2020-11-09 12:32:37 : Starting Recon and orthologues
2020-11-09 12:32:37 : Starting OF Orthologues
2020-11-09 12:32:37 : Done 0 of 4
2020-11-09 12:32:37 : Done OF Orthologues
2020-11-09 12:32:37 : Done Recon
2020-11-09 12:32:37 : Done orthologues
Results: /home/immunoviromics/antiviral_shared/Max/BLASTloopMDA5/MDA5Output_e-20_c75/species/OrthoFinder/Results_Nov09_5/
CITATION:
When publishing work that uses OrthoFinder please cite:
Emms D.M. & Kelly S. (2019), Genome Biology 20:238
If you use the species tree in your work then please also cite:
Emms D.M. & Kelly S. (2017), MBE 34(12): 3267-3278
Emms D.M. & Kelly S. (2018), bioRxiv https://doi.org/10.1101/267914
Exception RuntimeError: RuntimeError('cannot join current thread',) in <Finalize object, dead> ignored `
As you can see, fewer genes are being analyzed here and yet it seems to work, so I am confused about why OrthoFinder fails with nearly double the number of input sequences. Does this mean that this output is somehow inaccurate?
Thank you, Max
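To see at a glance which species pairs are affected, the "Too few hits" warnings in a saved copy of the log can be tallied with a short script. This is only a sketch: the regular expression is based on the warning text shown in this thread, and the function names are hypothetical.

```python
import re
from collections import Counter

# Pattern copied from the warning wording that appears in the log in this thread.
WARNING_RE = re.compile(
    r"WARNING: Too few hits between species (\d+) and species (\d+)"
)


def too_few_hit_pairs(log_text):
    """Return the sorted, de-duplicated (query, target) species-index pairs."""
    return sorted({(int(a), int(b)) for a, b in WARNING_RE.findall(log_text)})


def warnings_per_species(log_text):
    """Count how many distinct warnings name each species index as the query."""
    return Counter(query for query, _ in too_few_hit_pairs(log_text))


# Example usage, assuming the console output was saved to a file named run.log:
# text = open("run.log").read()
# print(too_few_hit_pairs(text))
# print(warnings_per_species(text))
```

In the 7-species run above, every warning involves species 2, 4, 5 or 6, which suggests those four inputs are the ones with too few sequences.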
Hi Max
Yes, OrthoFinder will have problems inferring an accurate species tree and identifying the correct root with so little data. It will also have problems at the orthogroup inference stage when correcting for the divergence between species. E.g. the question "are two genes orthologs from distantly related species or more anciently diverging paralogs from closely related species" is difficult to answer if there's not a pool of other genes to compare with. I'd definitely recommend providing all genes; it should run in less time than it takes to get an answer on GitHub ;)
All the best David
Hello, I've been having some trouble with Orthofinder 2.4.0 lately. When I attempt to run orthofinder, it seems that I am getting multiple errors in the "Analyzing Orthogroups" section. I've looked through many of the other help threads but I cannot find an answer to this problem. Any help would be greatly appreciated.
Here is my input and my output.
orthofinder -f species/
OrthoFinder version 2.4.0 Copyright (C) 2014 David Emms
2020-11-08 12:54:29 : Starting OrthoFinder
40 thread(s) for highly parallel tasks (BLAST searches etc.)
1 thread(s) for OrthoFinder algorithm
Checking required programs are installed
Test can run "mcl -h" - ok
Test can run "fastme -i /home/immunoviromics/antiviral_shared/Max/BLASTloopRIG/RIGOutput_e-20_c70/species/OrthoFinder/Results_Nov08/WorkingDirectory/SimpleTest.phy -o /home/immunoviromics/antiviral_shared/Max/BLASTloopRIG/RIGOutput_e-20_c70/species/OrthoFinder/Results_Nov08/WorkingDirectory/SimpleTest.tre" - ok
Dividing up work for BLAST for parallel processing
2020-11-08 12:54:30 : Creating diamond database 1 of 8
2020-11-08 12:54:30 : Creating diamond database 2 of 8
2020-11-08 12:54:30 : Creating diamond database 3 of 8
2020-11-08 12:54:30 : Creating diamond database 4 of 8
2020-11-08 12:54:30 : Creating diamond database 5 of 8
2020-11-08 12:54:30 : Creating diamond database 6 of 8
2020-11-08 12:54:30 : Creating diamond database 7 of 8
2020-11-08 12:54:30 : Creating diamond database 8 of 8
Running diamond all-versus-all
Using 40 thread(s)
2020-11-08 12:54:30 : This may take some time....
2020-11-08 12:54:30 : Done 0 of 64
2020-11-08 12:54:30 : Done 10 of 64
2020-11-08 12:54:30 : Done 20 of 64
2020-11-08 12:54:31 : Done all-versus-all sequence search
Running OrthoFinder algorithm
2020-11-08 12:54:31 : Initial processing of each species
2020-11-08 12:54:31 : Initial processing of species 0 complete
2020-11-08 12:54:31 : Initial processing of species 1 complete
2020-11-08 12:54:32 : Initial processing of species 2 complete
2020-11-08 12:54:32 : Initial processing of species 3 complete
2020-11-08 12:54:32 : Initial processing of species 4 complete
WARNING: Too few hits between species 5 and species 5 to normalise the scores, these hits will be ignored
WARNING: Too few hits between species 5 and species 6 to normalise the scores, these hits will be ignored
2020-11-08 12:54:32 : Initial processing of species 5 complete
WARNING: Too few hits between species 6 and species 5 to normalise the scores, these hits will be ignored
WARNING: Too few hits between species 6 and species 6 to normalise the scores, these hits will be ignored
2020-11-08 12:54:32 : Initial processing of species 6 complete
2020-11-08 12:54:32 : Initial processing of species 7 complete
2020-11-08 12:54:34 : Connected putative homologues
2020-11-08 12:54:34 : Written final scores for species 0 to graph file
2020-11-08 12:54:34 : Written final scores for species 1 to graph file
2020-11-08 12:54:34 : Written final scores for species 2 to graph file
2020-11-08 12:54:34 : Written final scores for species 3 to graph file
2020-11-08 12:54:34 : Written final scores for species 4 to graph file
2020-11-08 12:54:34 : Written final scores for species 5 to graph file
2020-11-08 12:54:34 : Written final scores for species 6 to graph file
2020-11-08 12:54:34 : Written final scores for species 7 to graph file
2020-11-08 12:54:34 : Ran MCL
Writing orthogroups to file
OrthoFinder assigned 46 genes (95.8% of total) to 6 orthogroups. Fifty percent of all genes were in orthogroups with 11 or more genes (G50 was 11) and were contained in the largest 2 orthogroups (O50 was 2). There were 0 orthogroups with all species present and 0 of these consisted entirely of single-copy genes.
2020-11-08 12:54:34 : Done orthogroups
Analysing Orthogroups
Calculating gene distances
Exception RuntimeError: RuntimeError('cannot join current thread',) in <Finalize object, dead> ignored
2020-11-08 12:54:36 : Done
Using fallback species tree inference method
/tmp/_MEI0m2Vfc/numpy/core/fromnumeric.py:2920: RuntimeWarning: Mean of empty slice.
/tmp/_MEI0m2Vfc/numpy/core/_methods.py:85: RuntimeWarning: invalid value encountered in double_scalars
Inferring gene and species trees
Best outgroup(s) for species tree
2020-11-08 12:54:38 : Starting STRIDE
Traceback (most recent call last):
File "orthofinder.py", line 7, in
File "scripts_of/main.py", line 1733, in main
File "scripts_of/main.py", line 1513, in GetOrthologues
File "scripts_of/orthologues.py", line 1004, in OrthologuesWorkflow
File "scripts_of/stride.py", line 509, in GetRoot
File "scripts_of/tree.py", line 221, in init
File "scripts_of/newick.py", line 216, in read_newick
scripts_of.newick.NewickError: Unexisting tree file or Malformed newick tree structure.
[1688272] Failed to execute script orthofinder
Much thanks, Max
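Since the traceback ends in a `NewickError` ("Unexisting tree file or Malformed newick tree structure"), a rough way to inspect a suspect tree file is sketched below. Which file OrthoFinder was reading is not shown in the log, so the path in the usage comment is purely hypothetical. The NaN check is an assumption prompted by the NumPy "Mean of empty slice" warnings printed just before the crash; whether NaN values actually reach the tree file is not confirmed here. The check is deliberately basic and ignores quoted labels that contain parentheses.

```python
import re
from pathlib import Path


def newick_problems(text):
    """Return a list of basic structural problems found in a Newick string."""
    stripped = text.strip()
    if not stripped:
        return ["tree is empty"]
    problems = []
    if not stripped.endswith(";"):
        problems.append("missing terminating ';'")
    depth = 0
    for ch in stripped:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                problems.append("')' without matching '('")
                break
    if depth > 0:
        problems.append("unclosed '('")
    # Assumption: a branch length written as 'nan' would make the tree unusable.
    if re.search(r":\s*[+-]?nan", stripped, re.IGNORECASE):
        problems.append("branch length is NaN")
    return problems


def check_tree_file(path):
    """Report problems for a tree file, including the case where it is missing."""
    p = Path(path)
    if not p.is_file():
        return ["file does not exist"]
    return newick_problems(p.read_text())


# Example usage with a hypothetical file name:
# print(check_tree_file("path/to/suspect_tree.txt"))
```

An empty list means none of these basic checks failed; it does not guarantee that the tree parses.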