Closed jkreinz closed 1 year ago
Have you tried restarting your conda env and R session? I think this is a system issue related to your permissions.
Thanks - I've tried with no luck. Do you have any suggestions about what type of system permissions GENESPACE would need?
You need read and write access to the working directory, but I'm not sure that this is the problem. Are you working on a remote cluster? Perhaps try running a couple of genomes locally on your machine to make sure everything is OK there. For the most part, you don't need big compute for GENESPACE (10 mammal genomes run on my mac in about an hour).
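For anyone who wants to rule out permissions quickly, here is a minimal sketch in plain Python, independent of GENESPACE. It checks whether a directory is readable and whether a file can actually be created in it. `check_read_write` is a hypothetical helper name, and the path you pass is whatever you use as your working directory.

```python
import os
import tempfile


def check_read_write(directory):
    """Return (readable, writable) for a directory.

    Hypothetical helper, not part of GENESPACE. Writability is tested by
    really creating (and automatically deleting) a temporary file, which
    also catches read-only mounts and exhausted quotas.
    """
    readable = os.path.isdir(directory) and os.access(directory, os.R_OK | os.X_OK)
    writable = False
    try:
        with tempfile.NamedTemporaryFile(dir=directory):
            writable = True
    except OSError:
        writable = False
    return readable, writable
```

Calling `check_read_write("/path/to/workingDirectory")` on the cluster and on the local machine gives a quick comparison. If both values are `True` in both places, permissions are probably not the culprit.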
OK thanks, not sure what was happening on the remote server, but that error is no longer a problem when I run it on my local computer. However, I'm getting the error below on step 3. I saw a similar error here that seems like it was patched in version 1.1.4 ... I used devtools::install_github("jtlovell/GENESPACE") to install GENESPACE today.
3. Combining and annotating the blast files with orthogroup info ...
# Chunk 1 / 2 (17:34:26) ...
...Atub_193_hap2 v. Atub_Nune5_hap1: total hits = 730505, same og = 27096
...Atub_193_hap2 v. Atub_Nune5_hap2: total hits = 416984, same og = 25153
...Atub_193_hap1 v. Atub_193_hap1: total hits = 271904, same og = 56421
...Atub_193_hap1 v. Atub_Nune5_hap2: total hits = 319172, same og = 27186
...Atub_Nune5_hap1 v. Atub_193_hap1: total hits = 322136, same og = 25809
...Atub_Nune5_hap1 v. Atub_Nune5_hap2: total hits = 322931, same og = 22559
# Chunk 2 / 2 (17:34:43) ...
Error in rbindlist(mclapply(1:nrow(chnk), mc.cores = nCores, function(i) { :
Item 1 of input is not a data.frame, data.table or list
In addition: Warning message:
In mclapply(1:nrow(chnk), mc.cores = nCores, function(i) { :
all scheduled cores encountered errors in user code
Hmm, this error should not occur, and I have no idea what is causing it. Mind sharing the /bed and /peptide directories so I can troubleshoot? If that's OK, send me an email: jlovell[at]hudsonalpha[dot]org
sent! thanks!
Are you getting decent-looking blast hits for /orthofinder/Results_XXX/WorkingDirectory/Blast2_2.txt.gz?
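One rough way to eyeball a file like this is to count hits and look at the bit-score distribution. The sketch below is plain Python and makes an assumption not confirmed in this thread: that the file uses the standard 12-column tabular BLAST layout (query id first, subject id second, bit score last). `summarise_blast` is a hypothetical helper name.

```python
import gzip
from statistics import median


def summarise_blast(path):
    """Summarise a gzipped, tab-delimited BLAST-style hits file.

    Assumes 12 tab-separated columns per hit, with the query id in
    column 1, the subject id in column 2 and the bit score in column 12.
    """
    n_hits = 0
    queries = set()
    subjects = set()
    bitscores = []
    with gzip.open(path, "rt") as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 12:
                continue  # skip blank or malformed lines
            n_hits += 1
            queries.add(fields[0])
            subjects.add(fields[1])
            bitscores.append(float(fields[11]))
    return {
        "n_hits": n_hits,
        "n_queries": len(queries),
        "n_subjects": len(subjects),
        "median_bitscore": median(bitscores) if bitscores else None,
    }
```

An empty file, or one with very few distinct queries compared with the number of genes in the annotation, would suggest the search for that genome did not finish cleanly.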
That genome, "2: Atub_Nune5_hap1.fa", has a pretty strange-looking annotation: 7,192 genes are identical to another gene. Species 1, "Atub_193_hap2.fa", is also potentially problematic, with 4,230 genes that have a duplicate. Sometimes we get duplicates (rDNA, r-genes, etc.), but it's not very common (for example, the other two genomes have 139 and 238 duplicates). Duplicate gene models can also be an indicator of mis-phased haplotypes in an outbred genome, where one hap gets a bunch of genes that should be on the other.
My guess is that orthofinder broke in some way with genome 2 and/or 3, and GENESPACE didn't see it. As of now, diamond2 has been running on the self hits for genome 2 for 1.5h ... hopefully it gets through it and I can get a full orthofinder run. That way I can figure out what the problem is and write an informative error message. However, if that doesn't happen, you might try figuring out what went wrong with the annotation (e.g. annotation methods, gff parsing, etc.) and try again with fewer duplicate gene models.
I'll update here when orthofinder finishes (or crashes after 24h).
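To check an annotation for duplicate gene models before re-running, something like the following could help. It is only a sketch in plain Python, and it assumes that "identical" means an identical peptide sequence (ignoring case and a trailing stop `*`), which may not be exactly how the counts above were obtained. `read_fasta` and `count_duplicated_genes` are hypothetical helper names.

```python
from collections import defaultdict


def read_fasta(path):
    """Yield (gene_id, sequence) pairs from a FASTA file."""
    gene_id, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if gene_id is not None:
                    yield gene_id, "".join(chunks)
                # Use the first whitespace-delimited token as the gene id.
                gene_id, chunks = (line[1:].split() or [""])[0], []
            else:
                chunks.append(line)
    if gene_id is not None:
        yield gene_id, "".join(chunks)


def count_duplicated_genes(path):
    """Count genes whose peptide is identical to at least one other gene's."""
    by_sequence = defaultdict(list)
    for gene_id, seq in read_fasta(path):
        # Normalise case and drop a trailing stop symbol before comparing.
        by_sequence[seq.upper().rstrip("*")].append(gene_id)
    return sum(len(ids) for ids in by_sequence.values() if len(ids) > 1)
```

Running `count_duplicated_genes` on each file in the peptide directory gives one number per genome, which makes an outlier easy to spot.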
diamond2 --more-sensitive via OrthoFinder ended up taking almost 4h to complete the self blast hits for genome 2 (most of the others took 8-12min), but it did complete and the GENESPACE run finished normally after.
Hi John,
Excited to use GENESPACE! I'm running into segfault issues upon initialization of the run that subsequently lead R to crash - any thoughts?
Thanks! Julia