jtlovell / GENESPACE

Step 3 of run_genespace: Item 1 of input is not a data.frame, data.table or list #70

Closed KaneLab closed 1 year ago

KaneLab commented 1 year ago

Hi John. My GENESPACE run failed at step 3 of run_genespace. I'm testing it on two published, chromosome-scale genomes that I pulled from NCBI. Thanks a lot in advance.

> out <- run_genespace(gpar, overwrite = T)

############################
1. Running orthofinder (or parsing existing results)
    Checking for existing orthofinder results ...
        Copying files over to the temporary directory:
                /home/kyle/Epi_vs_Dend/tmp
        Running the following command in the shell: `orthofinder -f
                /home/kyle/Epi_vs_Dend/tmp -t 12 -a 1 -X -o
                /home/kyle/Epi_vs_Dend/orthofinder`.This can take a while. To
                check the progress, look in the `WorkingDirectory` in the output
                (-o) directory

    OrthoFinder version 2.5.4 Copyright (C) 2014 David Emms

    2023-03-06 14:42:29 : Starting OrthoFinder 2.5.4
    12 thread(s) for highly parallel tasks (BLAST searches etc.)
    1 thread(s) for OrthoFinder algorithm

    Checking required programs are installed
    ----------------------------------------
    Test can run "mcl -h" - ok
    Test can run "fastme -i /home/kyle/Epi_vs_Dend/orthofinder/Results_Mar06/WorkingDirectory/SimpleTest.phy -o /home/kyle/Epi_vs_Dend/orthofinder/Results_Mar06/WorkingDirectory/SimpleTest.tre" - ok

    Dividing up work for BLAST for parallel processing
    --------------------------------------------------
    2023-03-06 14:42:30 : Creating diamond database 1 of 2
    2023-03-06 14:42:30 : Creating diamond database 2 of 2

    Running diamond all-versus-all
    ------------------------------
    Using 12 thread(s)
    2023-03-06 14:42:30 : This may take some time....
    2023-03-06 14:49:02 : Done all-versus-all sequence search

    Running OrthoFinder algorithm
    -----------------------------
    2023-03-06 14:49:03 : Initial processing of each species
    2023-03-06 14:49:18 : Initial processing of species 0 complete
    2023-03-06 14:49:31 : Initial processing of species 1 complete
    2023-03-06 14:49:35 : Connected putative homologues
    2023-03-06 14:49:38 : Written final scores for species 0 to graph file
    2023-03-06 14:49:42 : Written final scores for species 1 to graph file
    2023-03-06 14:49:54 : Ran MCL

    Writing orthogroups to file
    ---------------------------
    OrthoFinder assigned 44638 genes (76.7% of total) to 14601 orthogroups. Fifty percent of all genes were in orthogroups with 2 or more genes (G50 was 2) and were contained in the largest 6838 orthogroups (O50 was 6838). There were 12112 orthogroups with all species present and 6902 of these consisted entirely of single-copy genes.

    2023-03-06 14:49:59 : Done orthogroups

    Analysing Orthogroups
    =====================

    Calculating gene distances
    --------------------------
    2023-03-06 14:50:38 : Done
    2023-03-06 14:50:39 : Done 0 of 2618
    2023-03-06 14:50:40 : Done 1000 of 2618
    2023-03-06 14:50:41 : Done 2000 of 2618

    Inferring gene and species trees
    --------------------------------

    Reconciling gene trees and species tree
    ---------------------------------------
    2023-03-06 14:50:56 : Starting Recon and orthologues
    2023-03-06 14:50:56 : Starting OF Orthologues
    2023-03-06 14:50:57 : Done 0 of 2618
    2023-03-06 14:51:01 : Done 1000 of 2618
    2023-03-06 14:51:03 : Done 2000 of 2618
    2023-03-06 14:51:04 : Done OF Orthologues

    Writing results files
    =====================
    2023-03-06 14:51:06 : Done orthologues

    Results:
        /home/kyle/Epi_vs_Dend/orthofinder/Results_Mar06/

    CITATION:
     When publishing work that uses OrthoFinder please cite:
     Emms D.M. & Kelly S. (2019), Genome Biology 20:238

     If you use the species tree in your work then please also cite:
     Emms D.M. & Kelly S. (2017), MBE 34(12): 3267-3278
     Emms D.M. & Kelly S. (2018), bioRxiv https://doi.org/10.1101/267914
############################
2. Combining and annotating bed files w/ OGs and tandem array info ...
    ##############
    Flagging chrs. w/ < 10 unique orthogroups
    ...Dendrobium:  121 genes on  23 small chrs. 
    ...Vanilla   : 1477 genes on 415 small chrs. ***
        NOTE! Genomes flagged *** have > 5% of genes on small chrs. These are
                likely not great assemblies and should be examined carefully
    ##############
    Flagging over-dispered OGs
    ...Dendrobium: 3979 genes in 71 OGs hit > 8 unique places ***
    ...Vanilla   :  173 genes in  9 OGs hit > 8 unique places 
        NOTE! Genomes flagged *** have > 5% of genes in over-dispersed
                orthogroups. These are likely not great annotations, or the
                synteny run contains un-specified WGDs. Regardless, these should
                be examined carefully
    ##############
    Annotation summaries (after exclusions):
    ...Dendrobium: 25381 genes in 19951 OGs || 4503 genes in 1743 arrays
    ...Vanilla   : 27565 genes in 20264 OGs || 8420 genes in 3564 arrays

############################
3. Combining and annotating the blast files with orthogroup info ...
    # Chunk 1 / 1 (02:51:15 PM) ... 
Error in rbindlist(mclapply(1:nrow(chnk), mc.cores = nCores, function(i) { : 
  Item 1 of input is not a data.frame, data.table or list
In addition: Warning message:
In mclapply(1:nrow(chnk), mc.cores = nCores, function(i) { :
  all scheduled cores encountered errors in user code
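
For readers hitting the same message: the two lines above fit a general pattern in which `mclapply` does not stop when a worker fails but returns `try-error` objects, which `rbindlist` then rejects. The sketch below reproduces that pattern in isolation on a Unix-like system. The `worker` function is hypothetical and is not GENESPACE code; it only stands in for whatever failed per blast file.

```r
library(data.table)
library(parallel)

# Hypothetical worker that always fails (an assumption for illustration only).
worker <- function(i) stop("problem in worker ", i)

# With mc.cores > 1 on Unix-alikes, mclapply returns "try-error" objects
# instead of stopping, and warns that the scheduled cores encountered errors.
res <- suppressWarnings(mclapply(1:2, worker, mc.cores = 2))

# Flag which results are errors rather than tables.
failed <- vapply(res, inherits, logical(1), what = "try-error")
print(failed)

# rbindlist refuses items that are not a data.frame, data.table or list.
msg <- tryCatch(rbindlist(res), error = function(e) conditionMessage(e))
print(msg)

# The underlying error of each failed worker is kept in the "condition" attribute.
print(vapply(res[failed],
             function(x) conditionMessage(attr(x, "condition")),
             character(1)))
```

Printing the stored conditions this way can show the real cause, which the `rbindlist` message hides.
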

jtlovell commented 1 year ago

Can you email me the urls to these genomes? Thanks!

jtlovell commented 1 year ago

OK - this was a simple fix. I pushed a patch (v1.1.4) to the master branch. Install from master rather than from the release.
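
A minimal sketch of what installing from master could look like in R, assuming the devtools package is available (`remotes::install_github()` takes the same arguments). The `ref = "master"` argument and the version check are illustrative and are not quoted from this thread.

```r
# Assumption: devtools is installed; if not, run install.packages("devtools") first.
# Install GENESPACE from the master branch of the GitHub repository
# rather than from a tagged release.
devtools::install_github("jtlovell/GENESPACE", ref = "master")

# Check which version was installed; the patch is described above as v1.1.4.
packageVersion("GENESPACE")
```

Restarting the R session after installing is usually a good idea, so that the newly installed version is the one that gets loaded.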

gubrins commented 1 year ago

What do you mean by that? I have the same problem.