jtlovell / GENESPACE

Reporting an issue for data.table #97

Closed LQHHHHH closed 1 year ago

LQHHHHH commented 1 year ago

Hi,

Recently, while running the latest version of GENESPACE, I hit a "caught segfault" error with the cause "memory not mapped". I traced it to the data.table package, which GENESPACE uses internally. I'm posting my solution here in case anyone else runs into the same problem. I set data.table to use a single thread instead of all available cores:

library("data.table")
setDTthreads(1)

This works for me.

Best, Qionghou

jtlovell commented 1 year ago

Thanks - which function returned this particular error? GENESPACE should internally set the data.table threads to 1 and use the parallel package for parallelization. Sounds like it missed this on one occasion.
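
For anyone reading along, the intended pattern can be sketched roughly as below. This is only an illustration of "data.table pinned to one thread, parallel doing the fan-out", not GENESPACE's actual code; summarise_chunk and the toy data are hypothetical.

```r
library(data.table)
library(parallel)

# Pin data.table to one thread so it does not compete with forked workers.
old_threads <- getDTthreads()
setDTthreads(1)

# Hypothetical stand-in for per-file work; not a GENESPACE function.
summarise_chunk <- function(i) {
  dt <- data.table(chunk = i, x = seq_len(5) * i)
  dt[, .(total = sum(x)), by = chunk]
}

# mclapply forks on Unix-alikes; on Windows mc.cores must be 1.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L
res <- rbindlist(mclapply(1:4, summarise_chunk, mc.cores = n_cores))
print(res)

# Restore the previous data.table thread setting.
setDTthreads(old_threads)
```

The parallelism here comes only from mclapply; data.table itself stays single-threaded for the duration of the call.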

LQHHHHH commented 1 year ago

I found that this error arises in the init_genespace.R script during the execution of check_annotFiles, as well as in the run_genespace.R script during the step "9. Build pan-genes (aka pan-genome annotations)". However, I failed to note down the specific line where it occurs. By the way, the error appeared when I ran GENESPACE on two servers with Hyper-Threading, but it disappeared when I used a server without Hyper-Threading.

jtlovell commented 1 year ago

I should have a patch for this and a few other minor issues within the week. Will tag here on the push.

noor-albader commented 1 year ago

I think I am getting a similar data.table error:

############################
7. Final block coordinate calculation and riparian plotting ...
    ##############
    Calculating syntenic blocks by reference chromosomes ... 
        n regions (aggregated by 25 gene radius): 26578
        n blocks (collinear sets of > 5 genes): 72892
    ##############
    Building ref.-phased blks and riparian plots for haploid genomes:
Error in rbindlist(mclapply(synhitFiles, mc.cores = nCores, function(j) { : 
  Item 10 of input is not a data.frame, data.table or list
In addition: Warning message:
In mclapply(synhitFiles, mc.cores = nCores, function(j) { :
  scheduled core 10 encountered error in user code, all values of the job will be affected

I tried the workaround:

library("data.table")
data.table 1.14.8 using 28 threads (see ?getDTthreads).  Latest news: r-datatable.com
 setDTthreads(1)

I then relaunched and still got the same error (at item 7 instead of item 10):

############################
7. Final block coordinate calculation and riparian plotting ...
    ##############
    Calculating syntenic blocks by reference chromosomes ... 
        n regions (aggregated by 25 gene radius): 26578
        n blocks (collinear sets of > 5 genes): 72892
    ##############
    Building ref.-phased blks and riparian plots for haploid genomes:
Error in rbindlist(mclapply(synhitFiles, mc.cores = nCores, function(j) { : 
  Item 7 of input is not a data.frame, data.table or list
In addition: Warning message:
In mclapply(synhitFiles, mc.cores = nCores, function(j) { :
  scheduled core 7 encountered error in user code, all values of the job will be affected

jtlovell commented 1 year ago

Are you still getting this error?

gubrins commented 1 year ago

Same here, at item 3... any solutions?
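
Not a fix, but one way to narrow this down: the rbindlist message "Item N of input is not a data.frame, data.table or list" usually means that mclapply returned an error object for that element instead of a table, so the real error is hidden. The sketch below shows how to surface it. It is a generic illustration, not GENESPACE code; read_one and its failing input are hypothetical stand-ins for whatever is run per synhits file.

```r
library(data.table)
library(parallel)

# Hypothetical worker: fails on one input to mimic a file that cannot be processed.
read_one <- function(j) {
  if (j == 3) stop("could not parse input ", j)
  data.table(id = j, value = j^2)
}

n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Wrap each call in try() so a failure comes back as a "try-error" object
# for that element only, whatever the number of cores.
out <- mclapply(1:5, function(j) try(read_one(j), silent = TRUE),
                mc.cores = n_cores)

# Find the failed elements and print the underlying error messages.
failed <- which(vapply(out, inherits, logical(1), what = "try-error"))
for (k in failed) message("item ", k, ": ", trimws(as.character(out[[k]])))

# Bind only the elements that succeeded.
ok <- rbindlist(out[setdiff(seq_along(out), failed)])
print(ok)
```

Posting the underlying message printed this way, rather than the rbindlist error, should make it easier to tell whether the cause is threading, memory, or a problem with a specific input file.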