recommended or default p-value/p-adj cutoff

ay-lab / dcHiC

dcHiC: Differential compartment analysis for Hi-C datasets

MIT License


recommended or default p-value/p-adj cutoff #97

Closed usernicai closed 4 months ago

usernicai commented 4 months ago

I noticed that there is no difference between the intra_compartment.bedGraph file in the viz folder and the differential.intra_sample_group.pcOri.bedGraph file in the fdr_result folder. Does that mean I can use either one to call A/B compartments? Furthermore, how is the differential.intra_sample_group.Filtered.pcOri.bedGraph file derived from differential.intra_sample_group.pcOri.bedGraph? The documentation says it is "filtered by a p-value cutoff," but I didn't specify a threshold when I ran the script. I just want to make sure: can I use differential.intra_sample_group.Filtered.pcOri.bedGraph directly to call A/B compartments, or should I apply my own threshold to the intra_compartment.bedGraph file in the viz folder or the differential.intra_sample_group.pcOri.bedGraph file in the fdr_result folder? What is the recommended or default p-value/p-adj cutoff? Thank you so much!

ay-lab commented 4 months ago

The default FDR threshold is 0.1. You can set --fdr <cutoff> as required.

I would suggest keeping the default parameter; you can post-filter the pcOri.bedGraph file afterwards as you like.

In our manuscript, we were able to detect all the known compartmental changes, as well as others supported by independent evidence, at FDR < 0.1. You should be able to detect all the major changes within this threshold.
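
For anyone who wants to post-filter with a stricter cutoff, here is a minimal sketch in Python. It assumes the bedGraph is tab-delimited with a header row and that the adjusted p-value column is named `padj`. Both are assumptions, so check the header of your own file and adjust `padj_column` accordingly. The function name `filter_by_padj` and the example rows are made up for illustration and are not part of dcHiC.

```python
import csv
import io


def filter_by_padj(lines, cutoff=0.1, padj_column="padj"):
    """Return the header plus every row whose adjusted p-value is below cutoff.

    `lines` is any iterable of tab-delimited text lines (an open file works).
    The column name `padj` is an assumption; pass the real name if it differs.
    """
    reader = csv.reader(lines, delimiter="\t")
    header = next(reader)
    idx = header.index(padj_column)  # raises ValueError if the column is absent
    kept = [header]
    for row in reader:
        try:
            padj = float(row[idx])
        except (ValueError, IndexError):
            continue  # skip rows with a missing or non-numeric value such as "NA"
        if padj < cutoff:
            kept.append(row)
    return kept


# Made-up example data, only to show the behavior.
example = io.StringIO(
    "chr\tstart\tend\tpadj\n"
    "chr1\t0\t100000\t0.02\n"
    "chr1\t100000\t200000\t0.5\n"
    "chr1\t200000\t300000\tNA\n"
)
rows = filter_by_padj(example, cutoff=0.05)
for row in rows:
    print("\t".join(row))
```

To run it on a real file, pass an open file handle instead of the `io.StringIO` object, for example `with open(path, newline="") as fh: rows = filter_by_padj(fh, cutoff=0.05)`.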

Let me know if you have any other concerns.

usernicai commented 4 months ago

@ay-lab Thank you very much for your answer! I have one more question, about replicates. I have two types of data: HMEC (normal breast, two replicates) and TNBC (triple-negative breast cancer, represented by three cell lines with two replicates each). Should I treat the Hi-C data from the different TNBC cell lines as replicates of one group, which gives a 2 vs 6 design in the dcHiC input.txt file, or should I compare each cell line separately, which gives three 2 vs 2 comparisons? My only goal is to find the differential compartments between TNBC and HMEC. For that purpose, does heterogeneity between replicates matter much in a 2 vs 6 design? Thank you so much!

ay-lab commented 4 months ago

Hi,

Heterogeneity within replicates will affect the differential calls between samples. dcHiC expects compartment calls to be conserved across replicates and penalizes the p-value if heterogeneity within replicates increases significantly. So, if you compare 2 vs 6, it will be the most stringent comparison and the most confident one you can make.

usernicai commented 4 months ago

Thank you very much. I think my problem has been solved.