ay-lab / dcHiC

dcHiC: Differential compartment analysis for Hi-C datasets
MIT License

Error in subcompartment analysis #35

Closed csijcs closed 2 years ago

csijcs commented 2 years ago

I am having an issue with the subcompartment analysis step of the dchicf.r script. When I run:

Rscript ~/dcHiC/dchicf.r --file 40kb_input.txt --pcatype subcomp --dirovwt T --diffdir all_samples_40Kb

I get the error:

Error in `[<-.data.frame`(`*tmp*`, , "state", value = 1:6) :
  replacement has 6 rows, data has 3
Calls: subcompartment -> hmmsegment -> [<- -> [<-.data.frame
Execution halted

The subcompartment analysis runs for many chromosomes and samples, but then fails when it starts chr4 for the first sample in input.txt.

Prior to this I successfully ran:

Rscript ~/dcHiC/dchicf.r --file 40kb_input.txt --pcatype cis --dirovwt T --cthread 2 --pthread 4 --genome hg38 --fdr 0.05
Rscript ~/dcHiC/dchicf.r --file 40kb_input.txt --pcatype select --dirovwt T --genome hg38
Rscript ~/dcHiC/dchicf.r --file 40kb_input.txt --pcatype analyze --dirovwt T --diffdir all_samples_40Kb
Rscript ~/dcHiC/dchicf.r --file 40kb_input.txt --pcatype dloop --dirovwt T --diffdir all_samples_40Kb

I have also successfully run the subcompartment analysis on subsets of these samples, including for chr4 on the first sample in the input.txt, but I get this error when running all samples together.

ay-lab commented 2 years ago

Thanks for raising the issue separately. Have you tried to visualize the chromosome 4 normalized PC scores using the "viz" command? If not, please try the following:

Rscript dchicf.r --file 40kb_input.txt --pcatype viz --diffdir all_samples_40Kb --genome hg38 --pcgroup pcQnm

This will help me to debug the issue.

csijcs commented 2 years ago

When I run: Rscript ~/dcHiC/dchicf.r --file 40kb_input.txt --pcatype viz --diffdir all_samples_40kb --genome hg38 I get the following:

Warning messages:
1: In dir.create(paste0(diffdir, "/viz")) :
  cannot create dir 'DifferentialResult/all_samples_40kb/viz', reason 'No such file or directory'
2: In dir.create(paste0(diffdir, "/viz/files")) :
  cannot create dir 'DifferentialResult/all_samples_40kb/viz/files', reason 'No such file or directory'

But it appears to be an error and not just a warning, because the viz folder is not created.

ay-lab commented 2 years ago

Did you successfully execute this step? Do you see the DifferentialResult/all_samples_40kb/fdr_result directory?
Rscript ~/dcHiC/dchicf.r --file 40kb_input.txt --pcatype analyze --dirovwt T --diffdir all_samples_40Kb

I ask because the previous message says it could not find the DifferentialResult/all_samples_40kb directory. The analyze step should create the DifferentialResult/all_samples_40kb directory, and the viz step expects it.
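
A quick way to check this from the shell is sketched below. It assumes the results live under DifferentialResult/<diffdir> relative to the working directory, as the paths in the warning suggest. check_diffdir is a hypothetical helper written for this thread, not part of dcHiC.

```shell
# Hypothetical helper (not part of dcHiC): confirm that the directory the
# viz step will write into already exists, using the exact --diffdir spelling.
check_diffdir() {
  diffdir="DifferentialResult/$1"
  if [ -d "$diffdir" ]; then
    echo "found: $diffdir"
  else
    echo "missing: $diffdir" >&2
    return 1
  fi
}
```

Run it as `check_diffdir NAME`, passing NAME exactly as it was given to --diffdir. Directory names are case-sensitive on most Linux file systems, so the spelling has to match character for character.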

csijcs commented 2 years ago

I did successfully execute that step, and the folders DifferentialResult/all_samples_40kb and DifferentialResult/all_samples_40kb/fdr_result do exist.

ay-lab commented 2 years ago

Do you mind sharing the directory architecture of DifferentialResult/all_samples_40kb?
Something like the output of a tree command will be sufficient; I just want to look at the generated files. So far, I am not sure what caused this error (I have never seen it before). Sorry for the inconvenience.
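
If tree is not installed, a listing built on find gives roughly the same information. This is only a sketch: list_tree is a hypothetical helper name, and -maxdepth is a GNU/BSD find option rather than strict POSIX.

```shell
# list_tree DIR [DEPTH]: print every directory and file under DIR, sorted,
# down to DEPTH levels (default 2). A stand-in for `tree`.
list_tree() {
  find "$1" -maxdepth "${2:-2}" | LC_ALL=C sort
}
```

For example, `list_tree DifferentialResult 2` would show the top-level result directories and their immediate contents.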

csijcs commented 2 years ago

It contains 10 *_data folders for the 10 samples, each of which contains an *_intra_chr.pc.bedgraph for each chromosome and replicate, as well as an intra_chr_combined.pcOri.bedgraph and intra_chr_combined.pcQnm.bedgraph for each chromosome.

It also contains pcOri and pcQnm folders with intra_sample_chr_combined.*.bedgraph for each chromosome.

The DifferentialResult/all_samples_40Kb/fdr_result folder contains the following files:

differential.intra_compartmentLoops.bedpe                differential.intra_sample_chr16_combined.pcQnm.bedGraph  differential.intra_sample_chr2_combined.pcQnm.bedGraph  differential.intra_sample_chrX_combined.pcQnm.bedGraph
differential.intra_compartmentLoops.txt                  differential.intra_sample_chr17_combined.pcQnm.bedGraph  differential.intra_sample_chr3_combined.pcQnm.bedGraph  differential.intra_sample_combined.Filtered.pcQnm.bedGraph
differential.intra_sample_chr10_combined.pcQnm.bedGraph  differential.intra_sample_chr18_combined.pcQnm.bedGraph  differential.intra_sample_chr4_combined.pcQnm.bedGraph  differential.intra_sample_combined.pcQnm.bedGraph
differential.intra_sample_chr11_combined.pcQnm.bedGraph  differential.intra_sample_chr19_combined.pcQnm.bedGraph  differential.intra_sample_chr5_combined.pcQnm.bedGraph  differential.intra_sample_group.Filtered.pcOri.bedGraph
differential.intra_sample_chr12_combined.pcQnm.bedGraph  differential.intra_sample_chr1_combined.pcQnm.bedGraph   differential.intra_sample_chr6_combined.pcQnm.bedGraph  differential.intra_sample_group.Filtered.pcQnm.bedGraph
differential.intra_sample_chr13_combined.pcQnm.bedGraph  differential.intra_sample_chr20_combined.pcQnm.bedGraph  differential.intra_sample_chr7_combined.pcQnm.bedGraph  differential.intra_sample_group.pcOri.bedGraph
differential.intra_sample_chr14_combined.pcQnm.bedGraph  differential.intra_sample_chr21_combined.pcQnm.bedGraph  differential.intra_sample_chr8_combined.pcQnm.bedGraph  differential.intra_sample_group.pcQnm.bedGraph
differential.intra_sample_chr15_combined.pcQnm.bedGraph  differential.intra_sample_chr22_combined.pcQnm.bedGraph  differential.intra_sample_chr9_combined.pcQnm.bedGraph

ay-lab commented 2 years ago

Hi,

So, I just noticed that the diffdir name differs between two of your previous dcHiC commands: Rscript ~/dcHiC/dchicf.r --file 40kb_input.txt --pcatype analyze --dirovwt T --diffdir all_samples_40Kb and Rscript ~/dcHiC/dchicf.r --file 40kb_input.txt --pcatype viz --diffdir all_samples_40kb --genome hg38

The first one uses 40Kb and the second one uses 40kb, which is probably why you are getting an error during the viz step. Try this: Rscript ~/dcHiC/dchicf.r --file 40kb_input.txt --pcatype viz --diffdir all_samples_40Kb --genome hg38

It is strange to me that the "viz" step is failing. This step only requires one of the differential.intra_sample_group.*.bedGraph files, which were successfully generated. If the new command still fails, can you please send the differential.intra_sample_group.pcOri.bedGraph and differential.intra_sample_group.pcQnm.bedGraph files to abhijit@lji.org? I would like to replicate the error and give you some meaningful comments.
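
A case mismatch like this is easy to miss by eye. As a rough sketch, a case-insensitive lookup shows how the directory is actually spelled on disk. find_diffdir is a hypothetical helper, not part of dcHiC, and it assumes the results sit under DifferentialResult in the current working directory. The -iname option is available in GNU and BSD find.

```shell
# find_diffdir NAME: list directories directly under DifferentialResult whose
# name matches NAME ignoring case, printing the spelling stored on disk.
find_diffdir() {
  find DifferentialResult -mindepth 1 -maxdepth 1 -type d -iname "$1"
}
```

Calling `find_diffdir all_samples_40kb` prints the on-disk spelling of any matching directory, which can then be copied verbatim into --diffdir.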

csijcs commented 2 years ago

Wow, I'm embarrassed. Indeed it was 40Kb instead of 40kb. Really sorry to take up your time over a typo. Thanks very much for your help.

ay-lab commented 2 years ago

No issues! Happy to help you out. If you are still facing the subcompartment issue, please let me know.