jtlovell / GENESPACE

Step 4 synteny error #162

Open Hannah1746 opened 1 month ago

Hannah1746 commented 1 month ago

I am running GENESPACE and keep hitting the error below, and I can't pin down what is causing it.

############################

  1. Flagging synteny for each pair of genomes ...

    Chunk 1 / 2 (02:10:40 PM) ...

    Error in rbindlist(mclapply(1:nrow(chnk), mc.cores = nCores, function(i) { :
      Item 1 of input is not a data.frame, data.table or list
    Calls: run_genespace -> synteny -> rbindlist -> lapply -> FUN -> rbindlist
    In addition: Warning message:
    In mclapply(1:nrow(chnk), mc.cores = nCores, function(i) { :
      scheduled core 1 encountered error in user code, all values of the job will be affected
    Execution halted

I know which individual is causing the problem, but its bed and protein fasta inputs don't seem to have anything wrong with them.

I would love some help trying to debug this if you have time.

jtlovell commented 1 month ago

Please try running with nCores = 1 and reporting the error. Usually this happens when there is no synteny, but there are other possible causes.
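For readers following along, here is a minimal sketch of why a serial run can give a more useful message. It assumes a Unix-like system (forking with `mc.cores > 1` is not supported on Windows), and `worker` is a hypothetical stand-in, not GENESPACE's internal function. With several cores, `mclapply` hands failures back as `try-error` objects inside the result list, which would be consistent with `rbindlist` then complaining that an item is not a data.frame; with one core it falls back to `lapply`, so the original error is raised directly.

```r
library(parallel)

# Hypothetical worker that always fails; stands in for the per-pair function.
worker <- function(i) stop("worker failed on item ", i)

# Forked run: failures are returned as "try-error" objects in the list,
# and only a generic warning is emitted (suppressed here).
forked <- suppressWarnings(mclapply(1:2, worker, mc.cores = 2))
isErr <- vapply(forked, inherits, logical(1), what = "try-error")

# Serial run: mclapply falls back to lapply, so the real error propagates.
serialMsg <- tryCatch(
  mclapply(1:2, worker, mc.cores = 1),
  error = function(e) conditionMessage(e)
)

print(isErr)
print(serialMsg)
```

In the serial case the captured message is the worker's own error text rather than the downstream `rbindlist` complaint.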

Hannah1746 commented 1 month ago

Here is the new error:

############################

  1. Flagging synteny for each pair of genomes ...

    Error in FUN(X[[i]], ...) : object 'outHits' not found
    Calls: run_genespace ... lapply -> FUN -> rbindlist -> mclapply -> lapply -> FUN
    Execution halted

I know for a fact there is synteny between all my individuals. The one that is causing issues (DR) has a one-to-two gene ratio with my others.

Input code:

library(GENESPACE)

wd <- "/mnt/krab3/catostomid_GENESPACE"
setwd(wd)

path2mcscanx <- "/home/krablab/Documents/apps/MCScanX"

gpar <- init_genespace(
  wd = wd,
  path2mcscanx = path2mcscanx,
  genomeIDs = c("DR", "M.asiaticus", "X.texanus", "H.nigricans", "C.commersonii", "M.valenciennesi"),
  ploidy = c(0, 1, 1, 1, 1, 1),
  nCores = 1
)

out <- run_genespace(gpar, overwrite = TRUE)

jtlovell commented 1 month ago

I never thought to check for ploidy > 0 ... can you try that, ploidy = c(0,1,1,1,1,1) + 1
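A small sketch of what that suggestion evaluates to, using the genome IDs from the post above. The pre-flight check is hypothetical and written only for illustration; it is not a GENESPACE function, and it assumes nothing beyond the point made in this thread that ploidy values should be greater than 0.

```r
genomeIDs <- c("DR", "M.asiaticus", "X.texanus",
               "H.nigricans", "C.commersonii", "M.valenciennesi")
ploidy <- c(0, 1, 1, 1, 1, 1)

# Hypothetical pre-flight check: which genomes have a ploidy below 1?
nonPositive <- genomeIDs[ploidy < 1]

# The suggested workaround adds 1 to every element of the vector.
ploidyFixed <- ploidy + 1
names(ploidyFixed) <- genomeIDs

print(nonPositive)
print(ploidyFixed)
```

The shifted vector keeps the same relative ratio between DR and the other genomes while making every value positive.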

Hannah1746 commented 1 month ago

So this moved me forward but I am still getting an error:

  1. Integrating syntenic positions across genomes ...

    ##############
    Generating syntenic dotplots ... Done!
    ##############
    Interpolating syntenic positions of genes ...
    Drer: (0 / 1 / 2 / >2 syntenic positions)
    Error in vecseq(f, len, if (allow.cartesian || notjoin || !anyDuplicated(f__, :
      Join results in 342554 rows; more than 338510 = nrow(x)+nrow(i). Check for duplicate key values in i each of which join to the same group in x over and over again. If that's ok, try by=.EACHI to run j for each group to avoid the large allocation. If you are sure you wish to proceed, rerun with allow.cartesian=TRUE. Otherwise, please search for this error message in the FAQ, Wiki, Stack Overflow and data.table issue tracker for advice.
    Calls: run_genespace ... merge -> merge.data.table -> [ -> [.data.table -> vecseq
    In addition: There were 50 or more warnings (use warnings() to see the first 50)
    Execution halted
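The join error above fires when duplicated key values make the merged table larger than `nrow(x) + nrow(i)`. The sketch below reproduces that row-count condition on toy data, using base R data frames as stand-ins for the internal tables; the column names `blkID`, `gene`, and `start` are hypothetical and not taken from GENESPACE. It shows one way to look for duplicated keys, and is not the fix that was eventually applied.

```r
# Toy tables with a repeated key on both sides (hypothetical columns).
x <- data.frame(blkID = c("b1", "b1", "b1", "b2"),
                gene  = c("g1", "g2", "g3", "g4"))
i <- data.frame(blkID = c("b1", "b1", "b1", "b2"),
                start = c(10, 10, 10, 50))

# Key values that occur more than once in i.
dupKeys <- unique(i$blkID[duplicated(i$blkID)])

# A many-to-many merge multiplies the rows of each duplicated key.
joined <- merge(x, i, by = "blkID")
tooBig <- nrow(joined) > nrow(x) + nrow(i)

# Keeping one row per key in i restores a one-row-per-gene result.
iUnique <- i[!duplicated(i$blkID), ]
joinedUnique <- merge(x, iUnique, by = "blkID")

print(dupKeys)
print(c(nrow(joined), nrow(joinedUnique)))
```

Whether dropping duplicates is appropriate depends on why they exist in the first place, so this is a diagnostic rather than a remedy.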

The thing is, I have used this bed and fasta before to plot and it worked, but now it does not. When I take that genome out, I can also get the run to complete.

I am sorry for taking up so much of your time!!!

jtlovell commented 1 month ago

It's alright ... how do the dotplots look? Is it possible there is no synteny?

Hannah1746 commented 1 month ago

No, there is synteny. All the dotplots show that. Here are a couple of them:

Drer_vs_Gyro.syntenicHits.pdf
Drer_vs_H.nigricans.syntenicHits.pdf
X.texanus_vs_Drer.syntenicHits.pdf
X.texanus_vs_Gyro.syntenicHits.pdf

jtlovell commented 1 month ago

Please send me an email so I can troubleshoot your run: jlovell [at] hudsonalpha [dot] org

jtlovell commented 1 month ago

OK - there is something funky with your run that was causing there to be duplicated block coordinates ... I couldn't figure out what was causing that, but I did just commit a change to master that now runs through your genomes without erroring out.