ay-lab / dcHiC

dcHiC: Differential compartment analysis for Hi-C datasets
MIT License

Questions regarding running dcHiC #41

Closed sdontsay closed 1 year ago

sdontsay commented 1 year ago

Hi Ay,

I'm trying to run dcHiC on a human genome dataset. Unfortunately, I got several errors and couldn't get the output I want.

  1. The first step fails with the error message "Error in checkForRemoteErrors(val) : 2 nodes produced errors; first error: Two levels of parallelism are used. See ?assert_cores. Calls: lapply ... clusterApply -> staticClusterApply -> checkForRemoteErrors Execution halted". I guess this might be due to chromosomes Y, M, and Z in my data (I do have these chromosomes), but I'd still like your opinion on it.

  2. The second step, which selects the best PC, fails with "Error: Executable for bedtools not found! Please make sure that the software is correctly installed and, if necessary, path variables are set." I'm not familiar with this tool, and I did not see it mentioned as a prerequisite for this analysis, so I searched for it and tried installing it on the HPC I am using. However, because I don't have administrator privileges, I still cannot call it when running the program, so the error persists. Do you have any suggestions?

  3. Although I have 12 .matrix and .bed files in the data folder, I only get output for the first one listed in the input file (named NT1_20kb_pca). Do you have any idea why that is?

  4. I guess the other errors I encountered in the subsequent steps are the consequences of step 2.

Moreover, the content of my input file is as follows:

NT1_20000.matrix NT1_20000_abs.bed NT1_20Kb NT
NT2_20000.matrix NT2_20000_abs.bed NT2_20Kb NT
PT1_20000.matrix PT1_20000_abs.bed PT1_20Kb PT
PT2_20000.matrix PT2_20000_abs.bed PT2_20Kb PT
PT3_20000.matrix PT3_20000_abs.bed PT3_20Kb PT
PT4_20000.matrix PT4_20000_abs.bed PT4_20Kb PT
PT5_20000.matrix PT5_20000_abs.bed PT5_20Kb PT
RT1_20000.matrix RT1_20000_abs.bed RT1_20Kb RT
RT2_20000.matrix RT2_20000_abs.bed RT2_20Kb RT
RT3_20000.matrix RT3_20000_abs.bed RT3_20Kb RT
RT4_20000.matrix RT4_20000_abs.bed RT4_20Kb RT
RT5_20000.matrix RT5_20000_abs.bed RT5_20Kb RT
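Since only the first sample produced output, it may help to confirm that the input file really parses as one four-column row per sample and that every listed file exists. The following is a minimal Python sketch, not part of dcHiC: the function names are hypothetical, and it assumes the columns are matrix file, bed file, replicate name, and group, separated by whitespace (dcHiC itself may require tabs).

```python
from collections import defaultdict
from pathlib import Path


def parse_input_table(text):
    """Parse rows of: matrix file, bed file, replicate name, group."""
    rows = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 4:
            raise ValueError(
                f"line {lineno}: expected 4 columns, got {len(fields)}"
            )
        rows.append(tuple(fields))
    return rows


def missing_files(rows, data_dir="."):
    """Return the matrix/bed names from the table that are not in data_dir."""
    base = Path(data_dir)
    return [
        name
        for matrix, bed, _, _ in rows
        for name in (matrix, bed)
        if not (base / name).is_file()
    ]


def replicates_per_group(rows):
    """Map each group label to the replicate names assigned to it."""
    groups = defaultdict(list)
    for _, _, replicate, group in rows:
        groups[group].append(replicate)
    return dict(groups)
```

Calling `parse_input_table(Path("input.txt").read_text())` (the file name here is a placeholder) raises an error if any row does not have exactly four fields, for example when all samples end up on a single line.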

Thanks!

ay-lab commented 1 year ago

Hi,

Thanks for trying out the package. It seems like you may be missing the 'parallelly' R package. If so, please install it from https://cran.r-project.org/web/packages/parallelly/index.html or run install.packages('parallelly') within the R environment. I hope this will resolve the first issue.

Regarding the second issue, please install 'bedtools' from https://bedtools.readthedocs.io/en/latest/ and make sure the tool is in your $PATH. bedtools is a required dependency for dcHiC.

I hope the rest of the issues will be resolved once you follow the above instructions.
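To check whether bedtools is visible on $PATH from the environment where the job actually runs, a small Python sketch like the one below can be used. It is only an illustration: the helper names are hypothetical, and it relies on `bedtools --version`, which prints the installed version.

```python
import shutil
import subprocess


def find_executable(name="bedtools"):
    """Return the full path of `name` if it is on PATH, else None."""
    return shutil.which(name)


def report(name="bedtools"):
    """Describe where `name` was found and which version it reports."""
    path = find_executable(name)
    if path is None:
        return f"{name} not found on PATH"
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True
    )
    version = (result.stdout or result.stderr).strip()
    return f"{name} found at {path}: {version}"
```

Running `print(report())` inside the same batch job or interactive session that launches dcHiC shows whether the tool is reachable there. On many HPC systems a module loaded in the login shell is not automatically available inside a submitted job, so the two environments can differ.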

sdontsay commented 1 year ago

Hi Ay,

Thanks for your reply! I actually have the parallel package installed in R, and I asked for help from the HPC staff of our university for installing bedtools. It is now installed, and I can load it, and call it, but the error message is still there. Do you have any other advice?

Thanks!

ay-lab commented 1 year ago

Hi,

You can try two options. First, you can run with the '--cthread 1 --pthread 1' options, which avoids the clusterApply part. Second, you can log into your HPC system in interactive mode and run dchic with the '--cthread 2 --pthread 2' options. Let's resolve this issue first; then I can help you with the next one.
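As a rough illustration of the first option, the sketch below assembles a single-threaded command line in Python. Only '--cthread' and '--pthread' come from this thread; the script name `dchicf.r`, the input file name, and the '--file' and '--pcatype' flags are assumptions that should be checked against the dcHiC README before use.

```python
import shlex


def build_dchic_command(script, input_file, cthread=1, pthread=1):
    """Assemble an argument list for a dcHiC run.

    --cthread/--pthread are taken from the reply above; --file and
    --pcatype are assumed flag names, not confirmed in this thread.
    """
    if cthread < 1 or pthread < 1:
        raise ValueError("thread counts must be at least 1")
    return [
        "Rscript", script,
        "--file", input_file,
        "--pcatype", "cis",
        "--cthread", str(cthread),
        "--pthread", str(pthread),
    ]


# Placeholder script and input names, for illustration only.
cmd = build_dchic_command("dchicf.r", "input.txt")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`; passing `cthread=2, pthread=2` gives the variant suggested for an interactive session.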

sdontsay commented 1 year ago

Thank you, Ay. I tried the first option, and I got a new error, "Performing Z transformation : complete! Performing block wise correlation calculation Error in functionsdchic::oe2cor(mat, (start - 1), (end - 1), 1, 0) : upper value must be greater than lower value Calls: lapply -> FUN -> lapply -> FUN -> Execution halted".

ay-lab commented 1 year ago

Thanks for trying out the option. It seems like an issue either with the file format or the presence of a non-conventional chromosome with very few counts. Do you mind sending me the input bed file and a slice of the matrix files to abhijit@lji.org?
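One way to look for such chromosomes without sharing the data is to count the bins per chromosome in the bed file. The Python sketch below assumes a whitespace-separated bed file whose first three columns are chromosome, start, and end; the function names and the `min_bins` threshold are hypothetical and chosen only for illustration.

```python
from collections import Counter


def bins_per_chromosome(bed_lines):
    """Count how many bins (rows) each chromosome has in a bed file."""
    counts = Counter()
    for line in bed_lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        counts[fields[0]] += 1
    return counts


def flag_small_chromosomes(counts, min_bins=100):
    """Return chromosomes with fewer than min_bins bins (arbitrary cutoff)."""
    return sorted(chrom for chrom, n in counts.items() if n < min_bins)
```

For example, `with open("NT1_20000_abs.bed") as handle: print(flag_small_chromosomes(bins_per_chromosome(handle)))` lists the candidates. At 20 kb resolution the human mitochondrial chromosome (about 16.6 kb) fits in a single bin, so it is a likely entry; whether removing such chromosomes fixes the error above is not confirmed in this thread.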

sdontsay commented 1 year ago

Thank you, Ay, for this kind suggestion! However, because this data is still unpublished, we may need some time to think about it.

ay-lab commented 1 year ago

An alternative option is to run the PCA calculation separately (using HOMER/juicer) and use dcHiC to call the differential compartments. Please check the "Using Existing PC values with dcHiC" option under the utility folder.