Open: shzadiqbal opened 3 years ago
Hi
It looks like you probably didn't include all the genes from your species, which caused some of the data OrthoFinder relies on to be missing.
Best wishes David
Hi David, I am running into a similar issue that I haven't encountered in previous OrthoFinder analyses, and it isn't really clear to me what you mean by "you probably didn't include all the genes from your species". OrthoFinder takes as input an arbitrary set of protein FASTA files from different species/genome annotations, without any prior information about their completeness or which genes they include (or don't include), correct? And if some proteins originate from lineage-specific de novo genes, then by definition some input files won't contain proteins translated from those genes. Does that mean OrthoFinder would throw an exception every time such genes were encountered?
Hi!
I got this exact error and I don't know what to do about it. I have version 2.5.2 installed, and I never got this error before when using OrthoFinder in the same way as I did today.
Any help will be extremely valuable :)
Best,
Andrea
Hi @adamfreedman, no, that's not correct: OrthoFinder expects the proteomes to be complete. That is what the documentation states. In practice, it means that OrthoFinder needs to identify sufficiently many hits between each pair of species to carry out its analysis successfully. For the clustering stage, that means enough hits to model the amount of sequence divergence between each species pair across the different gene lengths seen. For species tree inference, it means finding enough genes shared across all species to infer a species tree, which is required for the subsequent analysis.
Hi @garmonan, would you be able to post the complete output that OrthoFinder produced? Could you also describe the input you provided? As some of the questions above are about the number of genes provided, probably the most important information is the number of species and the approximate minimum and average number of genes per species.
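For anyone wanting to collect the numbers requested above, here is a minimal sketch that counts sequences per input file by counting FASTA header lines. The function names (`count_genes_per_proteome`, `describe`) and the extension list are assumptions made for illustration, not part of OrthoFinder; adjust the extensions to match your own input directory.

```python
from pathlib import Path
from statistics import mean

# Assumption: proteome files use one of these extensions; edit as needed.
FASTA_SUFFIXES = {".fa", ".faa", ".fasta", ".fas", ".pep"}


def count_sequences(fasta_path):
    """Count FASTA records by counting lines that start with '>'."""
    with open(fasta_path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


def count_genes_per_proteome(directory):
    """Map each FASTA file name in `directory` to its number of sequences."""
    return {
        path.name: count_sequences(path)
        for path in sorted(Path(directory).iterdir())
        if path.is_file() and path.suffix.lower() in FASTA_SUFFIXES
    }


def describe(counts):
    """Summarise the counts: number of species, minimum and mean genes."""
    values = list(counts.values())
    if not values:
        return {"n_species": 0, "min_genes": None, "mean_genes": None}
    return {
        "n_species": len(values),
        "min_genes": min(values),
        "mean_genes": mean(values),
    }
```

Calling `describe(count_genes_per_proteome("path/to/pepfiles"))` (the path is a placeholder) gives the three figures asked for. A very low minimum relative to the mean would point to one or more incomplete proteomes.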
...
2021-03-13 16:16:44 : Written final scores for species 117 to graph file
2021-03-13 16:16:45 : Written final scores for species 110 to graph file
2021-03-13 16:16:45 : Written final scores for species 118 to graph file
2021-03-13 16:16:46 : Written final scores for species 119 to graph file
2021-03-13 16:16:46 : Written final scores for species 111 to graph file
WARNING: program called by OrthoFinder produced output to stderr
Command: mcl /home/mubashir/shzad/azam/pepfiles/OrthoFinder/Results_Mar13/WorkingDirectory/OrthoFinder_graph.txt -I 1.5 -o /home/mubashir/shzad/azam/pepfiles/OrthoFinder/Results_Mar13/WorkingDirectory/clusters_OrthoFinder_I1.5.txt -te 4 -V all
stdout
b''
stderr
b'[mcl] cut <4> instances of overlap\n[mcl] added <6> garbage entries\n'
2021-03-13 16:16:54 : Ran MCL
Writing orthogroups to file
OrthoFinder assigned 40499 genes (91.6% of total) to 9111 orthogroups. Fifty percent of all genes were in orthogroups with 6 or more genes (G50 was 6) and were contained in the largest 1810 orthogroups (O50 was 1810). There were 0 orthogroups with all species present and 0 of these consisted entirely of single-copy genes.
2021-03-13 16:17:01 : Done orthogroups
Analysing Orthogroups
Calculating gene distances
2021-03-13 16:17:33 : Done
Using fallback species tree inference method
/home/mubashir/miniconda2/envs/denovo/lib/python3.8/site-packages/numpy/core/fromnumeric.py:3419: RuntimeWarning: Mean of empty slice.
return _methods._mean(a, axis=axis, dtype=dtype,
/home/mubashir/miniconda2/envs/denovo/lib/python3.8/site-packages/numpy/core/_methods.py:188: RuntimeWarning: invalid value encountered in double_scalars
ret = ret.dtype.type(ret / rcount)
Inferring gene and species trees
2021-03-13 16:17:42 : Done 0 of 2972
2021-03-13 16:17:42 : Done 1000 of 2972
2021-03-13 16:17:43 : Done 2000 of 2972
Best outgroup(s) for species tree
2021-03-13 16:17:44 : Starting STRIDE
Traceback (most recent call last):
File "/home/mubashir/shzad/azam/OrthoFinder_source/scripts_of/stride.py", line 506, in GetRoot
speciesTree = tree.Tree(speciesTreeFN, format=2)
File "/home/mubashir/shzad/azam/OrthoFinder_source/scripts_of/tree.py", line 221, in __init__
read_newick(newick, root_node = self, format=format)
File "/home/mubashir/shzad/azam/OrthoFinder_source/scripts_of/newick.py", line 216, in read_newick
raise NewickError('Unexisting tree file or Malformed newick tree structure.')
scripts_of.newick.NewickError: Unexisting tree file or Malformed newick tree structure.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "OrthoFinder_source/orthofinder.py", line 7, in <module>
main(args)
File "/home/mubashir/shzad/azam/OrthoFinder_source/scripts_of/main.py", line 1765, in main
GetOrthologues(speciesInfoObj, options, prog_caller)
File "/home/mubashir/shzad/azam/OrthoFinder_source/scripts_of/main.py", line 1527, in GetOrthologues
orthologues.OrthologuesWorkflow(speciesInfoObj.speciesToUse,
File "/home/mubashir/shzad/azam/OrthoFinder_source/scripts_of/orthologues.py", line 1039, in OrthologuesWorkflow
roots, clusters_counter, rootedSpeciesTreeFN, nSupport, _, _, stride_dups = stride.GetRoot(spTreeFN_ids, files.FileHandler.GetOGsTreeDir(), stride.GeneToSpecies_dash, nHighParallel, qWriteRootedTree=True)
File "/home/mubashir/shzad/azam/OrthoFinder_source/scripts_of/stride.py", line 509, in GetRoot
speciesTree = tree.Tree(speciesTreeFN, format=1)
File "/home/mubashir/shzad/azam/OrthoFinder_source/scripts_of/tree.py", line 221, in __init__
read_newick(newick, root_node = self, format=format)
File "/home/mubashir/shzad/azam/OrthoFinder_source/scripts_of/newick.py", line 216, in read_newick
raise NewickError('Unexisting tree file or Malformed newick tree structure.')
scripts_of.newick.NewickError: Unexisting tree file or Malformed newick tree structure.
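The log above reports 0 orthogroups with all species present, followed by a "Mean of empty slice" warning from the fallback species tree method. To see how widely orthogroups are shared in a run like this, the rough sketch below tallies how many species are present in each orthogroup. It assumes a tab-separated gene count table whose first column is the orthogroup ID, whose last column is a total, and whose remaining columns are per-species counts (the layout I would expect from a file such as `Orthogroups.GeneCount.tsv`). The function name `species_occupancy` is hypothetical.

```python
import csv
from collections import Counter


def species_occupancy(gene_count_tsv):
    """Return (n_species, Counter) where the Counter maps
    'number of species present' -> 'number of orthogroups'."""
    occupancy = Counter()
    with open(gene_count_tsv, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader)
        # Assumed layout: column 0 = orthogroup ID, last column = total.
        species_columns = range(1, len(header) - 1)
        for row in reader:
            if not row:
                continue  # skip blank lines
            present = sum(1 for i in species_columns if int(row[i]) > 0)
            occupancy[present] += 1
    return len(species_columns), occupancy
```

If `occupancy[n_species]` is 0 and even the best-covered orthogroups span only a small fraction of the species, that fits the explanation given earlier in the thread: too few genes are shared across species for a species tree to be inferred.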