Closed jdobry-lab closed 1 year ago
What version of GENESPACE is this? It looks like OrthoFinder may not have run correctly (did you get a full run?).
Here is the output. The only thing I noticed is that under "Reconciling gene trees and species tree" there is a ValueError: invalid mode: 'rU'
Checking dependencies ...
Found valid path to OrthoFinder v2.54: orthofinder
Found valid path to DIAMOND2 v2.16: diamond
Found valid MCScanX_h executable: /Users/jasondobry/MCScanX-master/MCScanX_h
OrthoFinder version 2.5.4 Copyright (C) 2014 David Emms
2023-04-03 14:10:12 : Starting OrthoFinder 2.5.4
4 thread(s) for highly parallel tasks (BLAST searches etc.)
1 thread(s) for OrthoFinder algorithm
Checking required programs are installed
----------------------------------------
Test can run "mcl -h" - ok
Test can run "fastme -i /test/orthofinder/Results_Apr03/WorkingDirectory/SimpleTest.phy -o /test/orthofinder/Results_Apr03/WorkingDirectory/SimpleTest.tre" - ok
Dividing up work for BLAST for parallel processing
--------------------------------------------------
2023-04-03 14:10:15 : Creating diamond database 1 of 2
2023-04-03 14:10:16 : Creating diamond database 2 of 2
Running diamond all-versus-all
------------------------------
Using 4 thread(s)
2023-04-03 14:10:16 : This may take some time....
2023-04-03 14:10:16 : Done 0 of 4
2023-04-03 14:15:12 : Done all-versus-all sequence search
Running OrthoFinder algorithm
-----------------------------
2023-04-03 14:15:12 : Initial processing of each species
2023-04-03 14:15:15 : Initial processing of species 0 complete
2023-04-03 14:15:18 : Initial processing of species 1 complete
2023-04-03 14:15:21 : Connected putative homologues
2023-04-03 14:15:22 : Written final scores for species 0 to graph file
2023-04-03 14:15:22 : Written final scores for species 1 to graph file
2023-04-03 14:15:27 : Ran MCL
Writing orthogroups to file
---------------------------
OrthoFinder assigned 35168 genes (90.8% of total) to 13626 orthogroups. Fifty percent of all genes were in orthogroups with 2 or more genes (G50 was 2) and were contained in the largest 5722 orthogroups (O50 was 5722). There were 13007 orthogroups with all species present and 10817 of these consisted entirely of single-copy genes.
2023-04-03 14:15:32 : Done orthogroups
Analysing Orthogroups
=====================
Calculating gene distances
--------------------------
2023-04-03 14:15:39 : Done
2023-04-03 14:15:40 : Done 0 of 1127
2023-04-03 14:15:41 : Done 100 of 1127
2023-04-03 14:15:41 : Done 200 of 1127
2023-04-03 14:15:42 : Done 300 of 1127
2023-04-03 14:15:42 : Done 400 of 1127
2023-04-03 14:15:43 : Done 500 of 1127
2023-04-03 14:15:44 : Done 600 of 1127
2023-04-03 14:15:44 : Done 700 of 1127
2023-04-03 14:15:45 : Done 800 of 1127
2023-04-03 14:15:46 : Done 900 of 1127
2023-04-03 14:15:46 : Done 1000 of 1127
2023-04-03 14:15:47 : Done 1100 of 1127
Inferring gene and species trees
--------------------------------
Reconciling gene trees and species tree
---------------------------------------
2023-04-03 14:15:48 : Starting Recon and orthologues
2023-04-03 14:15:48 : Starting OF Orthologues
Traceback (most recent call last):
File "/anaconda3/envs/orthofinder/bin/orthofinder", line 7, in <module>
main(args)
File "/anaconda3/envs/orthofinder/bin/scripts_of/__main__.py", line 1778, in main
GetOrthologues(speciesInfoObj, options, prog_caller)
File "/anaconda3/envs/orthofinder/bin/scripts_of/__main__.py", line 1540, in GetOrthologues
orthologues.OrthologuesWorkflow(speciesInfoObj.speciesToUse,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/anaconda3/envs/orthofinder/bin/scripts_of/orthologues.py", line 1090, in OrthologuesWorkflow
ReconciliationAndOrthologues(recon_method, db.ogSet, nHighParallel, nLowParallel, i if qMultiple else None, stride_dups=stride_dups, q_split_para_clades=q_split_para_clades)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/anaconda3/envs/orthofinder/bin/scripts_of/orthologues.py", line 856, in ReconciliationAndOrthologues
species_tree_rooted_labelled = tree.Tree(speciesTree_ids_fn)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/anaconda3/envs/orthofinder/bin/scripts_of/tree.py", line 221, in __init__
read_newick(newick, root_node = self, format=format)
File "/anaconda3/envs/orthofinder/bin/scripts_of/newick.py", line 208, in read_newick
nw = open(newick, 'rU').read()
^^^^^^^^^^^^^^^^^^
ValueError: invalid mode: 'rU'
############################
...human v. human: total hits = 220600, same og = 67513
...chicken v. chicken: total hits = 190677, same og = 59986
...human v. chicken: total hits = 224855, same og = 18797
##############
Generating dotplots for all hits ... Done!
############################
...human v. chicken: 14776 hits (11494 anchors) in 590 blocks (509 SVs, 353 regions)
...human v. human: 58372 hits (20571 anchors) in 30 blocks (0 SVs, 0 regions)
...chicken v. chicken: 56904 hits (18084 anchors) in 96 blocks (0 SVs, 0 regions)
############################
GENESPACE version 1.1.4
OK, I got it running with version 1.1.8, but I am now getting the same error reported by another user:
Error in match_fasta2gff(path2fasta = fa, path2gff = gf, genespaceWd = genespaceWd, : some of the peptides have '.' or '-' in the sequence. Orthofinder can't handle this.
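For anyone hitting this message, one way to see which peptides trigger it is to scan the peptide FASTA for the two characters the error names. The following is a minimal Python sketch, not part of GENESPACE: the function name `find_flagged_peptides` and the example records are hypothetical, and it assumes only a standard FASTA layout (header lines starting with `>`).

```python
def find_flagged_peptides(fasta_text, flagged=(".", "-")):
    """Return IDs of FASTA records whose sequence contains any flagged character."""
    hits = []
    current_id = None
    current_has_flag = False
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Close out the previous record before starting a new one.
            if current_id is not None and current_has_flag:
                hits.append(current_id)
            # The ID is taken as the first whitespace-delimited token after '>'.
            parts = line[1:].split()
            current_id = parts[0] if parts else ""
            current_has_flag = False
        elif current_id is not None:
            if any(ch in line for ch in flagged):
                current_has_flag = True
    # Close out the final record.
    if current_id is not None and current_has_flag:
        hits.append(current_id)
    return hits


# Made-up example records, for illustration only.
example = """>gene1
MKT.LLV
>gene2
MKTALLV
>gene3
MK-ALLV
"""
print(find_flagged_peptides(example))  # ['gene1', 'gene3']
```

To check a real file, read its contents into a string and pass that in; the returned IDs are the records to inspect.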
Yes - the v1.1.8 parse_annotations bug will be fixed in the next release. Today probably.
But you don't need to re-run parse_annotations ... just use your existing wd with init_genespace and it should run through no prob.
The bug you reported for v1.1.4 is also known (#77) and caused by the orthologs step of orthofinder failing (which is why you get that traceback error in orthofinder). I have no idea why orthofinder fails there, but it sometimes does happen. GENESPACE v1.1.7+ can handle an incomplete orthofinder run.
I'll close this once v1.1.9 is posted with the parse_annotations bug fix.
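On the `ValueError: invalid mode: 'rU'` in the traceback: Python's built-in `open()` dropped the legacy `'U'` mode flag in Python 3.11, so the call `open(newick, 'rU')` shown in `newick.py` raises this error on 3.11 or newer. The caret markers in the traceback are also a feature introduced in 3.11, which is consistent with that, though the Python version in that conda environment is not stated in this thread. The sketch below reproduces the behaviour and shows the plain `'r'` equivalent; `read_newick_text` and `legacy_mode_error` are hypothetical helpers for illustration, not OrthoFinder's code.

```python
import os
import sys
import tempfile


def read_newick_text(path):
    """Read a Newick file as text. In Python 3, mode 'r' already applies
    universal-newline handling, so the legacy 'U' flag is unnecessary."""
    with open(path, "r") as handle:
        return handle.read()


def legacy_mode_error(path):
    """Try the legacy 'rU' mode; return the ValueError message, or None if accepted."""
    try:
        with open(path, "rU") as handle:
            handle.read()
    except ValueError as err:
        return str(err)
    return None


# Write a tiny placeholder tree (made-up branch lengths) to a temporary file.
fd, tree_path = tempfile.mkstemp(suffix=".nwk")
with os.fdopen(fd, "w") as handle:
    handle.write("(human:0.1,chicken:0.2);\n")

tree_text = read_newick_text(tree_path)
legacy_error = legacy_mode_error(tree_path)
os.remove(tree_path)

# On Python 3.11+ legacy_error holds the error message; on older versions it is None.
print(sys.version_info[:2], repr(legacy_error))
```

If this is indeed the cause, running OrthoFinder under a Python version older than 3.11 should avoid the failure at the orthologues step; that is an assumption based on the traceback, not something confirmed in this thread.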
The updated package is now at master. Let me know if it works for you. Update via:
detach("package:GENESPACE", unload = TRUE)
devtools::install_github("jtlovell/GENESPACE", upgrade = FALSE)
library(GENESPACE)
v1.1.10 is pushed to master and built as the latest release. I'm gonna close this issue, since the new release should address it. If this isn't the case, please re-open. Thanks!
Thank you. I haven't had a chance to test it yet. If I have further issues, I will let you know. Cheers!
Hi John,
Everything works fine until the pan-gene sets step. I got the following errors with the sample data:
Error in setnames(ogs, c("pgRepID", "ofID")) : Can't assign 2 names to a 0 column data.table
In addition: Warning message:
In system2(path2orthofinder, ofComm, stdout = TRUE, stderr = TRUE) :
running command ''orthofinder' -f /test/tmp -t 4 -a 1 -X -o /test/orthofinder 2>&1' had status 1