ay-lab / dcHiC

dcHiC: Differential compartment analysis for Hi-C datasets
MIT License

The program segfaults on ubuntu 22.04 #99

Open bhristov6 opened 6 months ago

bhristov6 commented 6 months ago

Hi, I installed dcHiC via the conda option and tried to call compartments in trans. After calculating the PCA on chr1, the program segfaulted.

Written exp1_do_pca/inter_pca/exp1_do_mat/chr1.inter.txt

caught segfault address (nil), cause 'unknown'

Traceback:
 1: functionsdchic::createtransijk(matpath, bedpath, paste0(outpath, "/", chr[l], ".inter.txt"), as.character(chr[l]))
 2: FUN(X[[i]], ...)
 3: lapply(1:length(chrom), extractTrans, chrom, normalizePath(df$mat[i]), normalizePath(df$bed[i]), paste0(df$prefix[i], "_pca/", "inter_pca/", df$prefix[i], "_mat"))
 4: FUN(X[[i]], ...)
 5: lapply(1:nrow(data), readfilesinter, data, pc, cthread, pthread, rowmrge, dirovwt)
An irrecoverable exception occurred. R is aborting now ...

ay-lab commented 6 months ago

Hi,

Can you do a manual installation (option 2) and try it again?

bhristov6 commented 6 months ago

ok, will do and let you know.

ay-lab commented 6 months ago

Also, I noticed you're trying to run the trans-calling part. Have you tried to run the cis-calling part?

bhristov6 commented 6 months ago

When running the cis-calling part, it generated files for most chromosomes but then threw this error:

Writing chr22 .txt file
Calculating expected counts from chromosome wise background
    dist Weight
1      0  15655
2 100000    386
3 200000    261
4 300000    201
5 400000    264
6 500000    131
       A     B Weight  chr1   pos1  chr2   pos2   dist    WeightOE
1: 30428 30428     23 chr21 100000 chr21 100000      0   0.6875759
2: 30428 30429     33 chr21 100000 chr21 200000 100000  39.9248705
3: 30428 30430     25 chr21 100000 chr21 300000 200000  44.6360153
4: 30428 30431     56 chr21 100000 chr21 400000 300000 129.5522388
5: 30428 30432     21 chr21 100000 chr21 500000 400000  36.9090909
6: 30428 30434     27 chr21 100000 chr21 700000 600000 235.3584906
[1] 147
[1] 100000
[1] 147
Writing chr21 .txt file
Calculating expected counts from chromosome wise background
Error in aggregate.data.frame(lhs, mf[-1L], FUN = FUN, ...) : no rows to aggregate
Calls: lapply ... aggregate -> aggregate.formula -> aggregate.data.frame
Execution halted

ay-lab commented 6 months ago

A few tips:

  1. Before re-running, please delete the old output folder. Leftover files from a previous run may cause errors.

  2. Please remove instances of non-standard chromosomes like chrY and chrM from the bed files. Due to a lack of interactions and small size, it may not find the bins to aggregate and can throw errors like this.
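The second tip could be applied with a small script like the one below. This is only a sketch and not part of dcHiC: the function name `filter_bed_lines` is hypothetical, and it assumes the first BED column holds the chromosome name and that "standard" means chr1 to chr22 plus chrX. Adjust the pattern for your genome.

```python
import re

# Assumption: standard chromosomes are chr1..chr22 and chrX.
# chrY, chrM, and scaffolds such as chr1_random are dropped.
STANDARD_CHROM = re.compile(r"chr(\d+|X)")


def filter_bed_lines(lines, keep=STANDARD_CHROM):
    """Return only the BED lines whose first column is a standard chromosome."""
    kept = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if keep.fullmatch(fields[0]):
            kept.append(line)
    return kept
```

To use it, read the BED file into a list of lines, pass the list to `filter_bed_lines`, and write the result to a new file so the original stays untouched.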

bhristov6 commented 4 months ago

I followed your tips and now it runs smoothly on the cis-calling part but still segfaults on the trans-calling:

caught segfault address (nil), cause 'unknown'

Traceback:
 1: functionsdchic::createtransijk(matpath, bedpath, paste0(outpath, "/", chr[l], ".inter.txt"), as.character(chr[l]))
 2: FUN(X[[i]], ...)
 3: lapply(1:length(chrom), extractTrans, chrom, normalizePath(df$mat[i]), normalizePath(df$bed[i]), paste0(df$prefix[i], "_pca/", "inter_pca/", df$prefix[i], "_mat"))
 4: FUN(X[[i]], ...)
 5: lapply(1:nrow(data), readfilesinter, data, pc, cthread, pthread, rowmrge, dirovwt)
An irrecoverable exception occurred. R is aborting now ...

ay-lab commented 4 months ago

At what resolution are you running the trans calls?

bhristov6 commented 4 months ago

100kb

ay-lab commented 4 months ago

Do you see any file under inter_pca/<prefix>_trans_mat?

I need to know if any specific chromosome is causing the issue.

bhristov6 commented 4 months ago

Yes, there is a file created for the first chromosome, chr1.inter.txt, which is about 450 MB.

ay-lab commented 4 months ago

I suggested the following on May 9th:

"Please remove instances of non-standard chromosomes like chrY and chrM from the bed files. Due to a lack of interactions and small size, it may not find the bins to aggregate and can throw errors like this."

Did you remove the non-standard chromosomes from the bed files?

If yes, that is sufficient for the cis-calling part only.

I think for trans we need to filter the matrix files too.

During the cis calculation, dcHiC only looks at the intra-chromosomal matrices.

In trans mode, however, dcHiC also looks at chrM or chrY if they are still present in the matrix files.

If any chromosome is smaller than 100 kb and/or has very few counts, this can cause a segmentation fault.
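Filtering the matrix files to match an already filtered bed file might look like the sketch below. It is not part of dcHiC, and both function names are hypothetical. It assumes the bed file has four columns (chromosome, start, end, bin index) and the matrix file is a sparse three-column table (bin A, bin B, count), which matches the A/B/Weight columns printed in the log earlier in this thread. Bin indices are left unchanged.

```python
def read_bin_ids(bed_lines):
    """Collect the bin indices (4th column) present in a filtered BED file."""
    ids = set()
    for line in bed_lines:
        fields = line.split()
        if len(fields) >= 4:
            ids.add(fields[3])
    return ids


def filter_matrix_lines(matrix_lines, keep_ids):
    """Keep sparse-matrix rows whose two bin indices both appear in keep_ids."""
    kept = []
    for line in matrix_lines:
        fields = line.split()
        # Assumption: columns are binA, binB, count.
        if len(fields) >= 3 and fields[0] in keep_ids and fields[1] in keep_ids:
            kept.append(line)
    return kept
```

Any row that touches a removed bin (for example a chr1 to chrM contact) is dropped, so the matrix no longer references chromosomes absent from the bed file.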

bhristov6 commented 4 months ago

Yes, I have removed the non-standard chromosomes and I'm using only chr1-22.

ay-lab commented 4 months ago

Is it possible to share the data, or at least a low-resolution version such as 1 Mb? I would like to reproduce the issue. abhiijit@lji.org

sotuamax commented 3 weeks ago

Is there a resolution for this issue? I am running into the same segfault error when using "--pcatype trans".

Best,

ay-lab commented 2 weeks ago

At what resolution are you running this? I need to reproduce the issue to resolve it.

sotuamax commented 2 weeks ago

Hello,

The issue occurs when I provide samples at different resolutions in the same input data. When I use only one resolution, the issue goes away. I think the input data for all samples have to be at the same resolution.
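A quick consistency check before running could be sketched as follows. This is a hypothetical helper, not a dcHiC feature. It assumes each sample's bed file has start and end in columns 2 and 3 and no header line, and it takes the most common bin width as the resolution because the last bin of a chromosome is usually shorter.

```python
from collections import Counter


def bed_resolution(bed_lines):
    """Infer the bin size as the most common (end - start) among BED rows."""
    sizes = Counter()
    for line in bed_lines:
        fields = line.split()
        if len(fields) >= 3:
            sizes[int(fields[2]) - int(fields[1])] += 1
    if not sizes:
        raise ValueError("no BED rows found")
    return sizes.most_common(1)[0][0]


def mixed_resolutions(beds):
    """Given {sample name: BED lines}, return {sample: resolution} if the
    samples disagree, or an empty dict if they all share one resolution."""
    res = {name: bed_resolution(lines) for name, lines in beds.items()}
    return res if len(set(res.values())) > 1 else {}
```

A non-empty result lists each sample with its inferred resolution, which makes the odd one out easy to spot.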

I also have a question about how to choose an optimal resolution for compartment calling. Could you clarify?

Thank you for your help.

Best,

ay-lab commented 2 weeks ago

It would be great if you could open a new issue about the optimal resolution.
Thanks.