tanlongzhi / dip-c

Tools to analyze Dip-C (or other 3C/Hi-C) data

About single cell chromatin A/B compartment clustering #42

Closed: AmyTanJ closed this issue 2 years ago

AmyTanJ commented 2 years ago

Hi @tanlongzhi, I hope you are doing well!

I have read the definition of the single-cell chromatin compartment in your paper, which states:

the single-cell chromatin compartment for each 1-Mb bin was calculated as the average CpG frequency of all other bins contacting it, weighted by the number of contacts.

I'm confused because I cannot find a corresponding weighting step in color2.py. How is the weighting done? For example, if two bins share 3 contacts, is the corresponding CpG frequency multiplied by 3, or is it handled some other way?

Any help would be greatly appreciated!

tanlongzhi commented 2 years ago

Hi @AmyTanJ, Thank you for your interest. In color2.py, the main loop iterates over each contact. As a result, if a pair of bins has 3 contacts, the CpG frequency is naturally counted (weighted) 3 times. Does that make sense? Best, Tan
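
Written as a formula, the weighting described above could look like the following. The symbols are assumptions introduced here for illustration and do not come from the paper or the code: $s_i$ is the compartment value of bin $i$, $n_{ij}$ is the number of contacts between bins $i$ and $j$, and $c_j$ is the CpG frequency of bin $j$.

$$
s_i = \frac{\sum_{j \neq i} n_{ij}\, c_j}{\sum_{j \neq i} n_{ij}}
$$

Under this reading, a partner bin with $n_{ij} = 3$ adds $3c_j$ to the numerator and 3 to the denominator, which is the same result as listing $c_j$ three times and taking a plain mean.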

AmyTanJ commented 2 years ago

Thank you for your reply, @tanlongzhi. When the average CpG frequency of a bin is finally calculated, are these 3 contacts counted as a single entry or as 3 independent entries? In other words, do they contribute 1 or 3 to the denominator?

tanlongzhi commented 2 years ago

Hi @AmyTanJ, in your example, the 3 contacts are included independently, i.e., as 3 identical CpG frequency values in the list.

In particular, in my code, the CpG frequency value of every contact is independently appended to a list in Line 96 and Line 100. The final averaging step (taking the mean of each list) is done in Line 104.
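
The append-then-average pattern described above can be sketched roughly as follows. This is a minimal illustration, not the actual color2.py code: the function name `contact_weighted_cpg`, the `(bin_a, bin_b)` contact representation, and the example CpG values are all hypothetical, and details such as filtering or the handling of contacts within a single bin are left out.

```python
from collections import defaultdict
from statistics import mean


def contact_weighted_cpg(contacts, cpg):
    """Average partner-bin CpG frequency per bin, weighted by contact count.

    contacts: iterable of (bin_a, bin_b) pairs, one entry per contact
              (hypothetical representation, not the Dip-C file format).
    cpg:      dict mapping a bin to its CpG frequency.
    """
    values = defaultdict(list)
    for bin_a, bin_b in contacts:
        # Each contact appends the partner's CpG frequency to each side's list,
        # so a bin pair with 3 contacts contributes 3 identical values.
        if bin_b in cpg:
            values[bin_a].append(cpg[bin_b])
        if bin_a in cpg:
            values[bin_b].append(cpg[bin_a])
    # Final averaging step: a plain mean of each list is already contact-weighted.
    return {b: mean(v) for b, v in values.items()}


# Bins A and B share 3 contacts; bins A and C share 1 (made-up numbers).
contacts = [("A", "B"), ("A", "B"), ("A", "B"), ("A", "C")]
cpg = {"A": 0.02, "B": 0.01, "C": 0.05}
result = contact_weighted_cpg(contacts, cpg)
```

In this example, the list for bin A holds four values (0.01 three times and 0.05 once), so the denominator of its mean is 4 and the three A-B contacts account for 3 of it.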

AmyTanJ commented 2 years ago

Hi @tanlongzhi, thanks a lot! I understand now :)