Closed noor-albader closed 2 years ago
Thanks for posting - this type of issue is why I am working on V1 (which will do format checks up front for everything). It's super hard to troubleshoot.
First, can you confirm that the orthofinder run was successful ... it should have spit out a really long dialog. You can go into /orthofinder/resultsXX/orthogroups and make sure there is an Orthogroups.tsv file.
If that's there and all good, then please re-run synteny with nCores = 1. That will give a more informative error.
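A minimal sketch of the file check described above, in base R. The helper name `find_orthogroups_tsv` and the variable `wd` are hypothetical; the search is recursive because the exact name of the results folder differs between runs.

```r
# Hypothetical helper (not part of GENESPACE): look for Orthogroups.tsv
# anywhere below <wd>/orthofinder and return the matching paths.
find_orthogroups_tsv <- function(wd) {
  of_dir <- file.path(wd, "orthofinder")
  if (!dir.exists(of_dir)) {
    return(character(0))
  }
  list.files(of_dir, pattern = "^Orthogroups\\.tsv$",
             recursive = TRUE, full.names = TRUE)
}

# Demo on a throwaway directory standing in for a real working directory
wd <- tempfile("gs_demo_")
dir.create(file.path(wd, "orthofinder", "Results_demo", "Orthogroups"),
           recursive = TRUE)
file.create(file.path(wd, "orthofinder", "Results_demo", "Orthogroups",
                      "Orthogroups.tsv"))
hits <- find_orthogroups_tsv(wd)
print(hits)
```

An empty result would suggest the orthofinder step did not finish; more than one hit would suggest several runs are sitting in the same directory.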
Hi John
Thank you for your reply! Yes, Orthofinder runs successfully every time I have tried running the pipeline, and an output directory /orthofinder/resultsXX/orthogroups is created that contains an Orthogroups.tsv file.
What I realized is that Orthofinder outputs orthogroups for all 22 genomes (including the outgroup, Lp), but when running synteny/McScan only 18 of the genomes' genes are placed in collinear arrays. The 4 that were excluded were Lp (the outgroup), OB, Os and OP.
Two things differ for these 4 genomes: (1) Orthofinder annotated 100% of their genes as orthologs (see step (4) below); (2) these are the 4 genomes that were parsed separately, with different parse_annotations settings (see step (2) below).
To demonstrate the points above, I re-ran the pipeline to capture the log output and took your suggestion of running with nCores=1. I get the following:
(1) initialising Genespace
> gpar <- init_genespace(
genomeIDs = c("Lp","OAcc","OAdd","OCkk", "OCll", "OGcc", "OGdd", "OLcc", "OLdd","OLhh", "OLjj", "OMALbb", "OMALcc", "OMINbb", "OMINcc", "ORhh", "ORjj", "Os", "OShh", "OSkk", "OP","OB"),
speciesIDs = c("Lp","OAcc","OAdd","OCkk", "OCll", "OGcc", "OGdd", "OLcc", "OLdd","OLhh", "OLjj", "OMALbb", "OMALcc", "OMINbb", "OMINcc", "ORhh", "ORjj", "Os", "OShh", "OSkk", "OP","OB"),
versionIDs = c("Lp","OAcc","OAdd","OCkk", "OCll", "OGcc", "OGdd", "OLcc", "OLdd","OLhh", "OLjj", "OMALbb", "OMALcc", "OMINbb", "OMINcc", "ORhh", "ORjj", "Os", "OShh", "OSkk", "OP","OB"),
outgroup = "Lp",
ploidy = rep(1,22),
diamondMode = "fast",
orthofinderMethod = "default",
wd = runwd,
orthofinderInBlk = TRUE,
overwrite = F,
verbose = T,
nCores = 1,
minPepLen = 50,
gffString = "gff",
pepString = "fa",
path2orthofinder = "/home/albadenm/software/miniconda3/envs/orthofinder/bin/orthofinder",
path2diamond = "diamond",
path2mcscanx = "/home/albadenm/software/MCScanX",
rawGenomeDir = file.path(runwd, "rawGenomes"))
Initializing GENESPACE run
checking genomeIDs ... PASS (Lp, OAcc, OAdd, OCkk, OCll, OGcc, OGdd, OLcc, OLdd, OLhh, OLjj, OMALbb, OMALcc, OMINbb, OMINcc, ORhh, ORjj, Os, OShh, OSkk, OP, OB)
checking outgroup ... PASS (Lp)
checking ploidy ... PASS (1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)
checking the number of parallel processes ... PASS (1)
Verbosity ... PASS (TRUE)
minPepLen ... PASS (50)
checking working directory ... PASS (/home/albadenm/Polyploid_group_renamed_mainChr)
checking parsed gff files ... PASS (/home/albadenm/Polyploid_group_renamed_mainChr/gff)
checking parsed peptide files ... PASS (/home/albadenm/Polyploid_group_renamed_mainChr/peptide)
Checking dependencies and 3rd party installations
MCScanX installation ... PASS (/home/albadenm/software/MCScanX/MCScanX_h)
Orthofinder installation ... PASS (/home/albadenm/software/miniconda3/envs/orthofinder/bin/orthofinder)
OrthoFinder method ... PASS - (default inside R)
Orthofinder in block method ... PASS (TRUE)
GENESPACE run successfully initialized
(2) Parsing the annotation
parse_annotations(
gsParam = gpar,
genomeIDs = c("Lp","Os","OP","OB"),
gffEntryType = "mRNA",
gffIdColumn = "ID",
gffStripText = "ID=",
headerEntryIndex = 5,
headerSep = " ",
headerStripText = "ID=")
Parsing annotation files ...
Lp ...
Importing gff ... found 38961 gff entires, and 38961 mRNA entries
Importing fasta ... found 38960 fasta entires
38836 gff-peptide matches
Done!
Os ...
Importing gff ... found 44643 gff entires, and 44643 mRNA entries
Importing fasta ... found 42355 fasta entires
41648 gff-peptide matches
Done!
OP ...
Importing gff ... found 41060 gff entires, and 41060 mRNA entries
Importing fasta ... found 41060 fasta entires
40917 gff-peptide matches
Done!
OB ...
Importing gff ... found 31356 gff entires, and 31356 mRNA entries
Importing fasta ... found 32037 fasta entires
31218 gff-peptide matches
Done!
> parse_annotations(
gsParam = gpar,
genomeIDs = c("OAcc","OAdd","OCkk", "OCll", "OGcc", "OGdd", "OLcc", "OLdd","OLhh", "OLjj", "OMALbb", "OMALcc", "OMINbb", "OMINcc", "ORhh", "ORjj", "OShh", "OSkk"),
gffEntryType = "Gene",
gffIdColumn = "ID",
gffStripText = "ID=",
headerEntryIndex = 1,
headerSep = " ",
headerStripText = "ID=")
Parsing annotation files ...
OAcc ...
Importing gff ... found 34936 gff entires, and 34936 Gene entries
Importing fasta ... found 34936 fasta entires
34936 gff-peptide matches
Done!
OAdd ...
Importing gff ... found 31451 gff entires, and 31451 Gene entries
Importing fasta ... found 31451 fasta entires
31451 gff-peptide matches
Done!
OCkk ...
Importing gff ... found 25845 gff entires, and 25845 Gene entries
Importing fasta ... found 25845 fasta entires
25845 gff-peptide matches
Done!
OCll ...
Importing gff ... found 26978 gff entires, and 26978 Gene entries
Importing fasta ... found 26978 fasta entires
26978 gff-peptide matches
Done!
OGcc ...
Importing gff ... found 34778 gff entires, and 34778 Gene entries
Importing fasta ... found 34778 fasta entires
34778 gff-peptide matches
Done!
OGdd ...
Importing gff ... found 31578 gff entires, and 31578 Gene entries
Importing fasta ... found 31578 fasta entires
31578 gff-peptide matches
Done!
OLcc ...
Importing gff ... found 37543 gff entires, and 37543 Gene entries
Importing fasta ... found 37543 fasta entires
37543 gff-peptide matches
Done!
OLdd ...
Importing gff ... found 36191 gff entires, and 36191 Gene entries
Importing fasta ... found 36191 fasta entires
36191 gff-peptide matches
Done!
OLhh ...
Importing gff ... found 38679 gff entires, and 38679 Gene entries
Importing fasta ... found 38679 fasta entires
38679 gff-peptide matches
Done!
OLjj ...
Importing gff ... found 34645 gff entires, and 34645 Gene entries
Importing fasta ... found 34645 fasta entires
34645 gff-peptide matches
Done!
OMALbb ...
Importing gff ... found 37927 gff entires, and 37927 Gene entries
Importing fasta ... found 37927 fasta entires
37927 gff-peptide matches
Done!
OMALcc ...
Importing gff ... found 39942 gff entires, and 39942 Gene entries
Importing fasta ... found 39942 fasta entires
39942 gff-peptide matches
Done!
OMINbb ...
Importing gff ... found 36063 gff entires, and 36063 Gene entries
Importing fasta ... found 36063 fasta entires
36062 gff-peptide matches
Done!
OMINcc ...
Importing gff ... found 38203 gff entires, and 38203 Gene entries
Importing fasta ... found 38203 fasta entires
38203 gff-peptide matches
Done!
ORhh ...
Importing gff ... found 43892 gff entires, and 43892 Gene entries
Importing fasta ... found 43892 fasta entires
43890 gff-peptide matches
Done!
ORjj ...
Importing gff ... found 38916 gff entires, and 38916 Gene entries
Importing fasta ... found 38916 fasta entires
38916 gff-peptide matches
Done!
OShh ...
Importing gff ... found 34293 gff entires, and 34293 Gene entries
Importing fasta ... found 34293 fasta entires
34293 gff-peptide matches
Done!
OSkk ...
Importing gff ... found 36564 gff entires, and 36564 Gene entries
Importing fasta ... found 36564 fasta entires
36563 gff-peptide matches
Done!
(3) Running Orthofinder
> gpar<-run_orthofinder(gsParam=gpar)
Synteny Parameters have not been set! Setting to defaults
Running 'defualt' genespace orthofinder method
############################################################
Cleaning out orthofinder directory and prepping run
Calculating blast results and running OrthoFinder
##################################################
##################################################
OrthoFinder version 2.5.4 Copyright (C) 2014 David Emms
2022-09-16 16:16:28 : Starting OrthoFinder 2.5.4
16 thread(s) for highly parallel tasks (BLAST searches etc.)
1 thread(s) for OrthoFinder algorithm
Checking required programs are installed
----------------------------------------
Test can run "mcl -h" - ok
Test can run "fastme -i /home/albadenm/Polyploid_group_renamed_mainChr/orthofinder/Results_Sep16/WorkingDirectory/SimpleTest.phy -o /home/albadenm/Polyploid_group_renamed_mainChr/orthofinder/Results_Sep16/WorkingDirectory/SimpleTest.tre" - ok
Dividing up work for BLAST for parallel processing
--------------------------------------------------
2022-09-16 16:16:31 : Creating diamond database 1 of 22
2022-09-16 16:16:31 : Creating diamond database 2 of 22
2022-09-16 16:16:32 : Creating diamond database 3 of 22
2022-09-16 16:16:32 : Creating diamond database 4 of 22
2022-09-16 16:16:32 : Creating diamond database 5 of 22
2022-09-16 16:16:32 : Creating diamond database 6 of 22
2022-09-16 16:16:32 : Creating diamond database 7 of 22
2022-09-16 16:16:32 : Creating diamond database 8 of 22
2022-09-16 16:16:33 : Creating diamond database 9 of 22
2022-09-16 16:16:33 : Creating diamond database 10 of 22
2022-09-16 16:16:33 : Creating diamond database 11 of 22
2022-09-16 16:16:33 : Creating diamond database 12 of 22
2022-09-16 16:16:33 : Creating diamond database 13 of 22
2022-09-16 16:16:33 : Creating diamond database 14 of 22
2022-09-16 16:16:34 : Creating diamond database 15 of 22
2022-09-16 16:16:34 : Creating diamond database 16 of 22
2022-09-16 16:16:34 : Creating diamond database 17 of 22
2022-09-16 16:16:34 : Creating diamond database 18 of 22
2022-09-16 16:16:34 : Creating diamond database 19 of 22
2022-09-16 16:16:35 : Creating diamond database 20 of 22
2022-09-16 16:16:35 : Creating diamond database 21 of 22
2022-09-16 16:16:35 : Creating diamond database 22 of 22
Running diamond all-versus-all
------------------------------
Using 16 thread(s)
2022-09-16 16:16:35 : This may take some time....
2022-09-16 16:16:35 : Done 0 of 484
2022-09-16 16:41:16 : Done 100 of 484
2022-09-16 17:03:12 : Done 200 of 484
2022-09-16 17:24:52 : Done 300 of 484
2022-09-16 17:42:37 : Done 400 of 484
2022-09-16 17:54:57 : Done all-versus-all sequence search
Running OrthoFinder algorithm
-----------------------------
2022-09-16 17:54:59 : Initial processing of each species
2022-09-16 17:56:19 : Initial processing of species 0 complete
2022-09-16 17:57:30 : Initial processing of species 1 complete
2022-09-16 17:58:32 : Initial processing of species 2 complete
2022-09-16 17:59:30 : Initial processing of species 3 complete
2022-09-16 18:00:17 : Initial processing of species 4 complete
2022-09-16 18:01:07 : Initial processing of species 5 complete
2022-09-16 18:02:18 : Initial processing of species 6 complete
2022-09-16 18:03:22 : Initial processing of species 7 complete
2022-09-16 18:04:38 : Initial processing of species 8 complete
2022-09-16 18:05:51 : Initial processing of species 9 complete
2022-09-16 18:07:02 : Initial processing of species 10 complete
2022-09-16 18:08:10 : Initial processing of species 11 complete
2022-09-16 18:09:24 : Initial processing of species 12 complete
2022-09-16 18:10:40 : Initial processing of species 13 complete
2022-09-16 18:11:51 : Initial processing of species 14 complete
2022-09-16 18:13:05 : Initial processing of species 15 complete
2022-09-16 18:14:27 : Initial processing of species 16 complete
2022-09-16 18:15:44 : Initial processing of species 17 complete
2022-09-16 18:16:56 : Initial processing of species 18 complete
2022-09-16 18:18:01 : Initial processing of species 19 complete
2022-09-16 18:19:07 : Initial processing of species 20 complete
2022-09-16 18:20:26 : Initial processing of species 21 complete
2022-09-16 18:22:25 : Connected putative homologues
2022-09-16 18:22:38 : Written final scores for species 0 to graph file
2022-09-16 18:22:50 : Written final scores for species 1 to graph file
2022-09-16 18:23:01 : Written final scores for species 2 to graph file
2022-09-16 18:23:11 : Written final scores for species 3 to graph file
2022-09-16 18:23:20 : Written final scores for species 4 to graph file
2022-09-16 18:23:29 : Written final scores for species 5 to graph file
2022-09-16 18:23:40 : Written final scores for species 6 to graph file
2022-09-16 18:23:51 : Written final scores for species 7 to graph file
2022-09-16 18:24:04 : Written final scores for species 8 to graph file
2022-09-16 18:24:16 : Written final scores for species 9 to graph file
2022-09-16 18:24:29 : Written final scores for species 10 to graph file
2022-09-16 18:24:41 : Written final scores for species 11 to graph file
2022-09-16 18:24:53 : Written final scores for species 12 to graph file
2022-09-16 18:25:07 : Written final scores for species 13 to graph file
2022-09-16 18:25:19 : Written final scores for species 14 to graph file
2022-09-16 18:25:32 : Written final scores for species 15 to graph file
2022-09-16 18:25:45 : Written final scores for species 16 to graph file
2022-09-16 18:26:00 : Written final scores for species 17 to graph file
2022-09-16 18:26:13 : Written final scores for species 18 to graph file
2022-09-16 18:26:24 : Written final scores for species 19 to graph file
2022-09-16 18:26:36 : Written final scores for species 20 to graph file
2022-09-16 18:26:50 : Written final scores for species 21 to graph file
WARNING: program called by OrthoFinder produced output to stderr
Command: mcl /home/albadenm/Polyploid_group_renamed_mainChr/orthofinder/Results_Sep16/WorkingDirectory/OrthoFinder_graph.txt -I 1.5 -o /home/albadenm/Polyploid_group_renamed_mainChr/orthofinder/Results_Sep16/WorkingDirectory/clusters_OrthoFinder_I1.5.txt -te 1 -V all
stdout
------
b''
stderr
------
b'[mcl] cut <1> instances of overlap\n'
2022-09-16 18:36:24 : Ran MCL
Writing orthogroups to file
---------------------------
OrthoFinder assigned 738903 genes (93.4% of total) to 52873 orthogroups. Fifty percent of all genes were in orthogroups with 24 or more genes (G50 was 24) and were contained in the largest 9276 orthogroups (O50 was 9276). There were 8469 orthogroups with all species present and 1739 of these consisted entirely of single-copy genes.
2022-09-16 18:36:40 : Done orthogroups
Analysing Orthogroups
=====================
Calculating gene distances
--------------------------
2022-09-16 18:58:37 : Done
2022-09-16 18:58:41 : Done 0 of 30576
2022-09-16 18:58:52 : Done 1000 of 30576
2022-09-16 18:58:53 : Done 2000 of 30576
2022-09-16 18:58:53 : Done 3000 of 30576
2022-09-16 18:58:54 : Done 4000 of 30576
2022-09-16 18:58:55 : Done 5000 of 30576
2022-09-16 18:58:55 : Done 6000 of 30576
2022-09-16 18:58:56 : Done 7000 of 30576
2022-09-16 18:58:56 : Done 8000 of 30576
2022-09-16 18:58:57 : Done 9000 of 30576
2022-09-16 18:58:57 : Done 10000 of 30576
2022-09-16 18:58:58 : Done 11000 of 30576
2022-09-16 18:58:58 : Done 12000 of 30576
2022-09-16 18:58:59 : Done 13000 of 30576
2022-09-16 18:58:59 : Done 14000 of 30576
2022-09-16 18:59:00 : Done 15000 of 30576
2022-09-16 18:59:00 : Done 16000 of 30576
2022-09-16 18:59:01 : Done 17000 of 30576
2022-09-16 18:59:01 : Done 18000 of 30576
2022-09-16 18:59:02 : Done 19000 of 30576
2022-09-16 18:59:02 : Done 20000 of 30576
2022-09-16 18:59:02 : Done 21000 of 30576
2022-09-16 18:59:03 : Done 22000 of 30576
2022-09-16 18:59:03 : Done 23000 of 30576
2022-09-16 18:59:04 : Done 24000 of 30576
2022-09-16 18:59:04 : Done 25000 of 30576
2022-09-16 18:59:05 : Done 26000 of 30576
2022-09-16 18:59:05 : Done 27000 of 30576
2022-09-16 18:59:06 : Done 28000 of 30576
2022-09-16 18:59:06 : Done 29000 of 30576
2022-09-16 18:59:07 : Done 30000 of 30576
Inferring gene and species trees
--------------------------------
8469 trees had all species present and will be used by STAG to infer the species tree
Best outgroup(s) for species tree
---------------------------------
2022-09-16 19:01:16 : Starting STRIDE
2022-09-16 19:01:20 : Done STRIDE
Observed 555 well-supported, non-terminal duplications. 551 support the best root and 4 contradict it.
Best outgroup for species tree:
OB
Reconciling gene trees and species tree
---------------------------------------
Outgroup: OB
2022-09-16 19:01:20 : Starting Recon and orthologues
2022-09-16 19:01:20 : Starting OF Orthologues
2022-09-16 19:01:22 : Done 0 of 30576
2022-09-16 19:01:55 : Done 1000 of 30576
2022-09-16 19:02:12 : Done 2000 of 30576
2022-09-16 19:02:26 : Done 3000 of 30576
2022-09-16 19:02:38 : Done 4000 of 30576
2022-09-16 19:02:49 : Done 5000 of 30576
2022-09-16 19:02:57 : Done 6000 of 30576
2022-09-16 19:03:05 : Done 7000 of 30576
2022-09-16 19:03:13 : Done 8000 of 30576
2022-09-16 19:03:21 : Done 9000 of 30576
2022-09-16 19:03:29 : Done 10000 of 30576
2022-09-16 19:03:36 : Done 11000 of 30576
2022-09-16 19:03:43 : Done 12000 of 30576
2022-09-16 19:03:50 : Done 13000 of 30576
2022-09-16 19:03:58 : Done 14000 of 30576
2022-09-16 19:04:05 : Done 15000 of 30576
2022-09-16 19:04:11 : Done 16000 of 30576
2022-09-16 19:04:18 : Done 17000 of 30576
2022-09-16 19:04:24 : Done 18000 of 30576
2022-09-16 19:04:30 : Done 19000 of 30576
2022-09-16 19:04:35 : Done 20000 of 30576
2022-09-16 19:04:38 : Done 21000 of 30576
2022-09-16 19:04:41 : Done 22000 of 30576
2022-09-16 19:04:44 : Done 23000 of 30576
2022-09-16 19:04:45 : Done 24000 of 30576
2022-09-16 19:04:47 : Done 25000 of 30576
2022-09-16 19:04:49 : Done 26000 of 30576
2022-09-16 19:04:50 : Done 27000 of 30576
2022-09-16 19:04:52 : Done 28000 of 30576
2022-09-16 19:04:53 : Done 29000 of 30576
2022-09-16 19:04:54 : Done 30000 of 30576
2022-09-16 19:04:55 : Done OF Orthologues
Writing results files
=====================
2022-09-16 19:04:58 : Done orthologues
Results:
/home/albadenm/Polyploid_group_renamed_mainChr/orthofinder/Results_Sep16/
CITATION:
When publishing work that uses OrthoFinder please cite:
Emms D.M. & Kelly S. (2019), Genome Biology 20:238
If you use the species tree in your work then please also cite:
Emms D.M. & Kelly S. (2017), MBE 34(12): 3267-3278
Emms D.M. & Kelly S. (2018), bioRxiv https://doi.org/10.1101/267914
(4) Running McScan
gpar <- synteny(gsParam = gpar)
Synteny Parameters have not been set! Setting to defaults
Indexing location of orthofinder results ... Done!
Parsing the gff files ...
Reading the gffs and adding orthofinder IDs ... Done!
Found 203172 global OGs for 752971 genes
QC-ing genome to ensure chromosomes/scaffolds are big enough...
Genome: n. chrs PASS/FAIL, n. genes PASS/FAIL, n. OGs PASS/FAIL
OAcc: 12/0, 34936/0, 30125/0
OAdd: 12/0, 31451/0, 27277/0
OB: 53/156, 31597/295, 31597/295
OCkk: 12/0, 25845/0, 23464/0
OCll: 12/0, 26978/0, 24273/0
OGcc: 12/0, 34778/0, 29934/0
OGdd: 12/0, 31578/0, 27247/0
OLcc: 12/0, 37543/0, 30439/0
OLdd: 12/0, 36191/0, 29831/0
OLhh: 12/0, 38679/0, 32749/0
OLjj: 12/0, 34645/0, 29611/0
OMALbb: 12/0, 37927/0, 32155/0
OMALcc: 12/0, 39942/0, 33339/0
OMINbb: 12/0, 36062/0, 30710/0
OMINcc: 12/0, 38203/0, 31793/0
OP: 12/0, 40917/0, 40917/0
ORhh: 12/0, 43890/0, 36776/0
ORjj: 12/0, 38916/0, 32907/0
OShh: 12/0, 34293/0, 29067/0
OSkk: 12/0, 36563/0, 31044/0
Os: 14/0, 41742/0, 41742/0
All look good!
Defining collinear orthogroup arrays ...
Found the following counts of arrays / genome:
OAcc: 6276 genes in 2419 collinear arrays
OAdd: 5455 genes in 2118 collinear arrays
OCkk: 3497 genes in 1535 collinear arrays
OCll: 4055 genes in 1743 collinear arrays
OGcc: 6127 genes in 2407 collinear arrays
OGdd: 5437 genes in 2106 collinear arrays
OLcc: 7862 genes in 2948 collinear arrays
OLdd: 6961 genes in 2625 collinear arrays
OLhh: 6941 genes in 2712 collinear arrays
OLjj: 6469 genes in 2534 collinear arrays
OMALbb: 7113 genes in 2741 collinear arrays
OMALcc: 7942 genes in 3049 collinear arrays
OMINbb: 6588 genes in 2561 collinear arrays
OMINcc: 7463 genes in 2864 collinear arrays
ORhh: 7664 genes in 3014 collinear arrays
ORjj: 7358 genes in 2812 collinear arrays
OShh: 6944 genes in 2585 collinear arrays
OSkk: 6552 genes in 2614 collinear arrays
Pulling synteny for 206 unique pairwise combinations of genomes
Running 206 chunks of up to 1 combinations each:
Chunk 1 / 206 (07:12:58 PM) ... Error in `[.data.table`(a, , `:=`(scrRank1, 1:.N), by = "ofID1") :
Supplied 2 items to be assigned to group 1 of size 0 in column 'scrRank1'. The RHS length must either be 1 (single values are ok) or match the LHS length exactly. If you wish to 'recycle' the RHS please use rep() explicitly to make this intent clear to readers of your code.
huh ... everything looks good here. That particular error means that GENESPACE can't find the blast file. It can for sure find the Orthogroups.tsv file (that's how it can make the collinear arrays). I have seen this error before, but only when there are multiple orthofinder runs in the /orthofinder directory, or when the user added a subsequent genome to a previous orthofinder run. Did you do either of those? If not, shoot me an email and we will get this figured out. In the future, GENESPACE v1 will catch these issues up front and make troubleshooting easier. But that is about a month out and I'd like to help you get this figured out before then.
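One way to check for the "multiple orthofinder runs" situation is sketched below in base R. It assumes, as in the log above, that each run leaves a folder named `Results_<date>` directly under `/orthofinder`; the helper name `list_orthofinder_runs` is hypothetical.

```r
# Hypothetical helper (not part of GENESPACE): list the Results_* folders
# that sit directly under <wd>/orthofinder.
list_orthofinder_runs <- function(wd) {
  of_dir <- file.path(wd, "orthofinder")
  if (!dir.exists(of_dir)) {
    return(character(0))
  }
  dirs <- list.dirs(of_dir, full.names = FALSE, recursive = FALSE)
  sort(dirs[grepl("^Results_", dirs)])
}

# Demo: a throwaway working directory containing two runs
wd <- tempfile("gs_demo_")
dir.create(file.path(wd, "orthofinder", "Results_Sep16"), recursive = TRUE)
dir.create(file.path(wd, "orthofinder", "Results_Sep17"), recursive = TRUE)
runs <- list_orthofinder_runs(wd)
if (length(runs) > 1) {
  message("More than one OrthoFinder run found: ",
          paste(runs, collapse = ", "))
}
```

If more than one folder is reported, starting again from an empty `/orthofinder` directory removes the ambiguity.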
Hi, I'm wondering if this issue ever got resolved because I am encountering an error at the same spot in the synteny function and assume it's probably the same problem.
The issue above was due to a mismatch between the genomeIDs and those used in the orthofinder run, I think caused by running orthofinder multiple times with different genomeIDs. I would recommend cleaning out your orthofinder directory and re-running from scratch. If that doesn't fix it, let me know.
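A rough sketch of how such an ID mismatch could be spotted in base R. It assumes the OrthoFinder working directory contains a SpeciesIDs.txt file with lines of the form `0: Lp.fa`, and that lines starting with `#` are excluded species; the function names are hypothetical and the demo uses made-up lines rather than a real file.

```r
# Hypothetical helpers (not part of GENESPACE or OrthoFinder).
# Assumed input: lines such as "0: Lp.fa" read from SpeciesIDs.txt.
parse_species_ids <- function(lines) {
  lines <- trimws(lines)
  lines <- lines[nzchar(lines) & !startsWith(lines, "#")]
  files <- sub("^[0-9]+:[[:space:]]*", "", lines)  # drop the numeric index
  sub("\\.[^.]+$", "", files)                      # drop the file extension
}

compare_ids <- function(genomeIDs, species_lines) {
  of_ids <- parse_species_ids(species_lines)
  list(missing_from_orthofinder = setdiff(genomeIDs, of_ids),
       unexpected_in_orthofinder = setdiff(of_ids, genomeIDs))
}

# Demo with made-up content; a real run would use readLines() on the file
demo_lines <- c("0: Lp.fa", "1: OAcc.fa", "2: OB.fa")
res <- compare_ids(c("Lp", "OAcc", "OP"), demo_lines)
print(res)
```

Both elements of the result being empty would indicate that the genome IDs and the OrthoFinder species agree.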
I deleted all the folders and reran it from scratch and still encountered the same issue. It makes it to chunk 3/48 before erroring out with the message below. The idea of IDs not matching seems plausible, but when I check my gIDs against the species listed in some of the orthofinder output files, they all match up exactly. "Error: $ operator is invalid for atomic vectors In addition: Warning message: In mclapply(1:nrow(splSynp[[i]]), mc.cores = nCores, function(j) { : scheduled core 1 encountered error in user code, all values of the job will be affected"
After some trial and error, it seems that certain species are breaking it, and it runs successfully on some smaller subsets of more model organisms. However, for all species I have the RefSeq translated_cds fasta and genomic.gff.
OK - I know the problem and I have a quick solution - the problematic genomes have special characters that orthofinder strips out internally. These are at least: '|' and ':'. So, genes that came in with those symbols in their names came out with a different name and couldn't be merged into the combined bed file.
v1.1.3 will be pushed to /dev ASAP and will contain a fix for this. Basically, the solution is to replace all special characters with '_'. This isn't the best solution, but it is the only one I can implement right now.
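To illustrate the idea of replacing special characters with '_', here is a minimal base R sketch. It is not the code that went into v1.1.3; the function name and the choice to keep only letters, digits, '_', '.' and '-' are assumptions, since the thread names only '|' and ':' explicitly.

```r
# Hypothetical helper (not the actual GENESPACE fix): replace every
# character outside an assumed "safe" set with an underscore.
sanitize_gene_ids <- function(ids) {
  gsub("[^A-Za-z0-9_.-]", "_", ids)
}

# Demo with made-up IDs containing '|' and ':'
raw_ids <- c("gene|XP_001.1", "lcl:NC_1")
clean_ids <- sanitize_gene_ids(raw_ids)
print(clean_ids)

# Replacement can make two different IDs collide, so check for duplicates
if (anyDuplicated(clean_ids) > 0) {
  warning("Sanitized IDs are no longer unique")
}
```

The same replacement would have to be applied to both the gff-derived IDs and the peptide headers so that the two still match each other.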
Hello, I was able to run the pipeline smoothly with the example data and a subset (2 genomes) of my data without error.
However, when running with 20+ genomes I obtained an error I have not been able to figure out.
Loading and parsing the data, as well as running orthofinder, went without a glitch, but now (without changing any synteny parameters, i.e. running with default parameters) I get the following error:
This part ran fine:
BUT running the synteny function I get this error:
I am not sure, but is this error perhaps due to the wrapper's scheduler? If not, do you have an idea where this error (which occurs only with larger sets of data) is originating from?