davidemms / OrthoFinder

Phylogenetic orthology inference for comparative genomics
https://davidemms.github.io/
GNU General Public License v3.0

Error: during Running Orthologue Prediction #31

Closed qiao-xin closed 7 years ago

qiao-xin commented 7 years ago

Running Orthologue Prediction

1. Checking required programs are installed

Test can run "fastme -i myfilepath/SimpleTest.phy -o myfilepath/SimpleTest.tre" - ok
Test can run "dlcpar_search --version" - ok

2. Calculating gene distances

2016-09-15 09:39:04 : Done 20 of 49
2016-09-15 09:38:40 : Done 0 of 49
2016-09-15 09:38:53 : Done 10 of 49
2016-09-15 09:39:15 : Done 30 of 49
2016-09-15 09:39:29 : Processing species 0
2016-09-15 09:40:08 : Processing species 1
2016-09-15 09:40:43 : Processing species 2
2016-09-15 09:41:24 : Processing species 3
2016-09-15 09:43:03 : Processing species 4
2016-09-15 09:43:33 : Processing species 5
2016-09-15 09:44:03 : Processing species 6

3. Inferring gene and species trees

. Error: Invalid distance matrix : numerical value expected for taxon '5_6492' instead of '-inf'.
2016-09-15 09:45:33 : Done 4000 of 10413
2016-09-15 09:45:13 : Done 1000 of 10413

4. Best outgroup(s) for species tree

Traceback (most recent call last):
  File "orthofinder.py", line 1190, in <module>
    orthologuesResultsFilesString = get_orthologues.GetOrthologues(workingDir, resultsDir, clustersFilename_pairs, nBlast)
  File "/home/george/programs/OrthoFinder-master/orthofinder/scripts/get_orthologues.py", line 566, in GetOrthologues
    roots, clusters, rootedSpeciesTreeFN, nSupport = rfd.GetRoot(spTreeFN_ids, os.path.split(db.treesPatIDs)[0] + "/", rfd.GeneToSpecies_dash, nProcesses, treeFmt = 1)
  File "/home/george/programs/OrthoFinder-master/orthofinder/scripts/root_from_duplications.py", line 401, in GetRoot
    list_of_lists = pool.map(SupportedHierachies_wrapper2, [(fn, GeneToSpeciesMap, species, dict_clades, clade_names) for fn in glob.glob(treesDir + "/*")])
  File "/home/george/anaconda2/lib/python2.7/multiprocessing/pool.py", line 251, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/home/george/anaconda2/lib/python2.7/multiprocessing/pool.py", line 567, in get
    raise self._value
scripts.newick.NewickError: Unexisting tree file or Malformed newick tree structure.
2016-09-15 09:35:32 : Writen final scores for species 0 to graph file
2016-09-15 09:35:43 : Writen final scores for species 1 to graph file
2016-09-15 09:35:54 : Writen final scores for species 2 to graph file
2016-09-15 09:36:17 : Writen final scores for species 3 to graph file
2016-09-15 09:36:27 : Writen final scores for species 4 to graph file
2016-09-15 09:36:36 : Writen final scores for species 5 to graph file
2016-09-15 09:36:45 : Writen final scores for species 6 to graph file

How to handle this error?
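The fastme message in the log complains about a '-inf' entry in a distance matrix. As a rough way to locate such entries, here is a minimal Python 3 sketch that scans a PHYLIP-style square matrix for values that are not finite numbers. It assumes a layout with the taxon count on the first line followed by one "name d1 d2 ..." row per taxon; the function name and the example matrix (including the distances and the taxon 5_100) are hypothetical and are not taken from OrthoFinder or from this dataset.

```python
import math


def find_nonfinite_distances(phylip_text):
    """Return (taxon, column_index, raw_value) for every non-finite entry."""
    lines = [ln for ln in phylip_text.splitlines() if ln.strip()]
    bad = []
    # Skip the first non-empty line, which is assumed to hold the taxon count.
    for line in lines[1:]:
        fields = line.split()
        taxon, values = fields[0], fields[1:]
        for col, raw in enumerate(values):
            try:
                ok = math.isfinite(float(raw))
            except ValueError:
                # Not parseable as a number at all.
                ok = False
            if not ok:
                bad.append((taxon, col, raw))
    return bad


# Hypothetical matrix, invented purely for illustration.
example = """3
5_6492 0.0 -inf 0.4
3_49031 -inf 0.0 0.2
5_100 0.4 0.2 0.0
"""

for taxon, col, raw in find_nonfinite_distances(example):
    print(taxon, col, raw)
```

Running something like this over the matrix files would only show where the bad values are; it does not explain why they were produced.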

mjfi2sb3 commented 7 years ago

Hi, has anyone addressed this error? I am seeing the same error and cannot tell how serious it is.

Regards, /SB

davidemms commented 7 years ago

Hi, could you find the "WorkingDirectory" that has the Blast*.txt files in it, run "grep 5_6492 Blast*.txt" in that directory, and paste the output that you get?

Many thanks David

davidemms commented 7 years ago

... and could you also let me know what version of OrthoFinder you are using please? Thanks

qiao-xin commented 7 years ago

@davidemms Many thanks for your kind reply. Following your suggestion, I ran the command, and the results are as follows:
Blast3_5.txt:3_49031 5_6492 65.00 60 21 0 3 62 3 62 3e-12 58.2
Blast5_3.txt:5_6492 3_49031 65.00 60 21 0 3 62 3 62 3e-12 58.2
Blast5_5.txt:5_6492 5_6492 100.00 87 0 0 1 87 1 87 2e-41 133

In addition, I can find the "Orthologues" directory in the "WorkingDirectory"; however, it contains only the directories "Gene_Trees" and "WorkingDirectory", so the output is incomplete.

I downloaded the OrthoFinder on Sep. 14, 2016. This version should be the latest version.

Thanks

qiao-xin commented 7 years ago

@davidemms If you need to check any related data files, I would be happy to upload them.

davidemms commented 7 years ago

I can't see at the moment what the cause of the error might be. Would you mind sharing the dataset with me if it is not too large? I would need a tar.gz or zip file of the WorkingDirectory (the one where you ran the grep command). Dropbox would be the best way to send the data to me. If this is ok, then send me an email and I will send you an invite to a Dropbox folder that you can put the compressed dataset in.

Thanks David

davidemms commented 7 years ago

The problem was caused by a rare case in the BLAST searches: a sequence was hit by another sequence, but when it was itself used as the query it didn't hit anything.
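To check whether a dataset contains sequences of this kind, one could look for IDs that appear as a subject but never as a query. The Python 3 sketch below assumes whitespace-separated BLAST tabular rows with the query ID in the first column and the subject ID in the second, as in the grep output earlier in this thread. The function name and the example rows are hypothetical, and this is not the fix that went into OrthoFinder.

```python
def hit_only_as_subject(blast_lines):
    """Return sorted IDs seen as a subject (column 2) but never as a query (column 1)."""
    queries, subjects = set(), set()
    for line in blast_lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or truncated rows
        queries.add(fields[0])
        subjects.add(fields[1])
    return sorted(subjects - queries)


# Hypothetical rows, invented for illustration only.
rows = [
    "0_1 0_1 100.00 87 0 0 1 87 1 87 2e-41 133",
    "0_1 1_7 65.00 60 21 0 3 62 3 62 3e-12 58.2",
]

print(hit_only_as_subject(rows))
```

In practice the rows would come from reading every Blast*.txt file in the WorkingDirectory, so that hits across all species pairs are counted together.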

mjfi2sb3 commented 7 years ago

So is this problem solved?

Do I need to re-run OrthoFinder, or can I just restart from the BLAST results?

qiao-xin commented 7 years ago

@mjfi2sb3 Yes, this problem has been solved by David. You need to download the latest version of OrthoFinder from GitHub and re-run it.

davidemms commented 7 years ago

You can restart from the BLAST results.

All the best David

CIWa commented 7 years ago

Hello everyone,

Last week I ran into a similar problem while running version 1.1.4 with MSA. My error message was:

Best outgroup(s) for species tree

Traceback (most recent call last):
  File "orthofinder/orthofinder.py", line 1354, in <module>
  File "orthofinder/orthofinder.py", line 1242, in GetOrthologues
  File "orthofinder/scripts/get_orthologues.py", line 746, in OrthologuesWorkflow
  File "orthofinder/scripts/root_from_duplications.py", line 405, in GetRoot
  File "multiprocessing/pool.py", line 250, in map
  File "multiprocessing/pool.py", line 554, in get
scripts.newick.NewickError: Unexisting tree file or Malformed newick tree structure.
Failed to execute script orthofinder
2017-06-30 20:44:28 : Writen final scores for species 3 to graph file
2017-06-30 20:44:28 : Writen final scores for species 4 to graph file
2017-06-30 20:44:30 : Writen final scores for species 9 to graph file
2017-06-30 20:44:30 : Writen final scores for species 5 to graph file
2017-06-30 20:44:44 : Writen final scores for species 0 to graph file
2017-06-30 20:44:47 : Writen final scores for species 8 to graph file
2017-06-30 20:44:50 : Writen final scores for species 10 to graph file
2017-06-30 20:44:58 : Writen final scores for species 2 to graph file
2017-06-30 20:46:23 : Writen final scores for species 6 to graph file

I will try again using version 1.1.8.

Best, Isabel

CIWa commented 7 years ago

Hi everyone,

With exactly the same data, version 1.1.8 seems to have no problem.

Best, Isabel

drkmj commented 6 years ago

Hi all,

I have run across this error too. I am using OrthoFinder version 2.1.0.

Any ideas what I can do to resolve this error?

Analysing Orthogroups

Calculating gene distances

2017-10-28 09:24:43 : Done 0 of 25
2017-10-28 09:42:27 : Done 10 of 25
2017-10-28 09:48:35 : Done 20 of 25
2017-10-28 09:50:58 : Processing species 0
2017-10-28 09:53:59 : Processing species 1
2017-10-28 09:54:21 : Processing species 2
2017-10-28 09:54:33 : Processing species 3
2017-10-28 09:54:52 : Processing species 4

Inferring gene and species trees

2017-10-28 09:55:40 : Done 0 of 9161
2017-10-28 09:59:33 : Done 1000 of 9161
2017-10-28 09:59:43 : Done 2000 of 9161
2017-10-28 09:59:52 : Done 3000 of 9161
2017-10-28 10:00:01 : Done 4000 of 9161
2017-10-28 10:00:11 : Done 5000 of 9161
2017-10-28 10:00:20 : Done 6000 of 9161
2017-10-28 10:00:29 : Done 7000 of 9161
2017-10-28 10:00:39 : Done 8000 of 9161
2017-10-28 10:00:48 : Done 9000 of 9161

Best outgroup(s) for species tree

Traceback (most recent call last):
  File "orthofinder.py", line 1533, in <module>
    GetOrthologues(dirs, options, program_caller, clustersFilename_pairs, orthogroupsResultsFilesString)
  File "orthofinder.py", line 1365, in GetOrthologues
    options.separatePickleDir)
  File "/Users/kjohnson/OrthoFinder-master/orthofinder/scripts/orthologues.py", line 924, in OrthologuesWorkflow
    roots, clusterscounter, rootedSpeciesTreeFN, nSupport, , _, all_stride_dup_genes = stride.GetRoot(spTreeFN_ids, os.path.split(db.TreeFilename_IDs(0))[0] + "/", stride.GeneToSpecies_dash, nHighParallel, treeFmt = 1, qWriteRootedTree=True)
  File "/Users/kjohnson/OrthoFinder-master/orthofinder/scripts/stride.py", line 499, in GetRoot
    list_of_dicts = pool.map(SupportedHierachies_wrapper2, [(fn, GeneToSpeciesMap, species, dict_clades, clade_names, qWriteDupTrees) for fn in glob.glob(treesDir + "/*")])
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 251, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 567, in get
    raise self._value
scripts.newick.NewickError: Unexisting tree file or Malformed newick tree structure.

Thank you, Kevin
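The NewickError in these tracebacks is raised while every file matched by glob.glob(treesDir + "/*") is being read as a tree, so one way to narrow it down might be to look for files in that directory that cannot be a complete Newick string. The Python 3 sketch below flags empty files, files with no terminating ';', and files with unbalanced parentheses. It is only a heuristic (quoted labels containing parentheses would confuse it), and the function name, file names and tree strings are hypothetical.

```python
import glob
import os
import tempfile


def suspicious_tree_files(trees_dir):
    """Return (path, reason) for files that cannot be a complete Newick tree."""
    problems = []
    for path in sorted(glob.glob(os.path.join(trees_dir, "*"))):
        if not os.path.isfile(path):
            continue
        with open(path) as handle:
            text = handle.read().strip()
        if not text:
            problems.append((path, "empty file"))
        elif not text.endswith(";"):
            problems.append((path, "missing terminating ';'"))
        elif text.count("(") != text.count(")"):
            problems.append((path, "unbalanced parentheses"))
    return problems


# Demo on a throwaway directory; file names and trees are invented.
demo_dir = tempfile.mkdtemp()
samples = {
    "OG0000000_tree.txt": "((0_1:0.1,1_2:0.2):0.05,2_3:0.3);\n",
    "OG0000001_tree.txt": "",
    "OG0000002_tree.txt": "((0_1:0.1,1_2:0.2,2_3:0.3);\n",
}
for name, content in samples.items():
    with open(os.path.join(demo_dir, name), "w") as handle:
        handle.write(content)

for path, reason in suspicious_tree_files(demo_dir):
    print(os.path.basename(path), "->", reason)
```

Pointing suspicious_tree_files at the real gene tree directory instead of demo_dir would list the candidate files. An empty tree file would be consistent with fastme having rejected the corresponding distance matrix, as in the first report in this thread.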