Prior normalization ?

zyqfrog10 commented 5 years ago

Hello,

I have a question on the matrix normalization.

Although your paper clearly states that "(existing methods) is fundamentally flawed, because the inherent biases present in the Hi-C data are very hard to model and completely eliminate", I am curious how prior normalization would affect SELFISH results. (I saw an option for biases in the Python version of SELFISH. Have you tested or compared different correction algorithms?)

Also, how should the optimal value of sigmaZero (-sz) be determined?

Thank you, YZ

biozzq commented 3 years ago

Dear @ay-lab

I am now learning how to better use selfish, and I have the same question about the input. The two preprints quoted below left me confused: I do not know whether I need to normalize the contact matrix (e.g., ICE normalization, sequencing depth) before running selfish. My understanding was that I should pass the raw counts and that selfish accounts for all of these normalizations itself. I hope you can help me. Thank you very much.

"Differential interaction analysis was carried out at 10 kb resolution for each replicate using selfish (Ardakany et al., 2019) based on ICE-normalized interaction counts from the Hi-C Pro all-valid-pairs matrix and reformatted into the “.hic” file-format using Juicebox (https://www.biorxiv.org/content/10.1101/2020.10.26.352583v3.full.pdf+html)"

"Selfish does not request that the data be normalized. Therefore, we simply ran Selfish at default settings, and evaluated significance based on BH-adjusted P-values. (https://www.biorxiv.org/content/10.1101/2020.09.03.281972v1.full.pdf+html)"

Sincerely, Zheng zhuqing

ay-lab commented 3 years ago

Hi Zheng, you will get better results if you normalize the data before feeding it to selfish.

biozzq commented 3 years ago

Dear @ay-lab

Thank you. Could you tell us which normalizations should be done before running selfish?

Best wishes, Zheng zhuqing

biozzq commented 3 years ago

Dear @ay-lab

I tried to normalize (by sequencing depth) and correct (by ICE or KR) the cool file using HiCExplorer. Can this corrected cool file be used directly as input, without providing a bias file? For example, by running the following: selfish -f1 correct.cool -f2 correct.cool -o $out -t 0.05 -ch $chr -r $res

Sincerely, Zheng zhuqing
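
As a rough illustration of what "normalizing by sequencing depth" means in the comment above, here is a minimal sketch that scales two contact matrices to the same total count. It uses plain NumPy on toy arrays; the function name `scale_to_smaller_depth` and the counts are hypothetical, and this is not necessarily what HiCExplorer does internally.

```python
import numpy as np

def scale_to_smaller_depth(m1, m2):
    """Scale both matrices so their totals equal the smaller of the two totals."""
    m1 = np.asarray(m1, dtype=float)
    m2 = np.asarray(m2, dtype=float)
    t1, t2 = m1.sum(), m2.sum()
    if t1 == 0 or t2 == 0:
        raise ValueError("cannot depth-normalize an empty contact matrix")
    target = min(t1, t2)
    # Each matrix is multiplied by (target / its own total).
    return m1 * (target / t1), m2 * (target / t2)

# Toy symmetric contact matrices (hypothetical counts); b is twice as deep as a.
a = np.array([[10, 4], [4, 6]])
b = np.array([[20, 8], [8, 12]])
a_scaled, b_scaled = scale_to_smaller_depth(a, b)
```

After scaling, both matrices have the same total, so differences between them are less likely to reflect sequencing depth alone.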

biozzq commented 3 years ago

Dear @ay-lab

I hope this finds you well. Could you help me with this issue? Thank you very much.

Sincerely, Zheng zhuqing

ay-lab commented 3 years ago

Hi Zheng, you don't need to provide bias files for .cool inputs, BUT selfish assumes that the .cool file provides raw counts plus a 'weight' column (which actually gives the biases); cooler generates the normalized values itself when selfish reads from the file. I'll add an option for your case and will let you know.
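
To make the role of the 'weight' column concrete: in cooler, a balanced value is the raw count multiplied by the weights of its two bins. The sketch below reproduces that arithmetic with NumPy on toy data. `apply_bin_weights` is a hypothetical helper for illustration only, not part of selfish or cooler, and the counts and weights are made up.

```python
import numpy as np

def apply_bin_weights(raw, weights):
    """Return balanced[i, j] = raw[i, j] * weights[i] * weights[j]."""
    raw = np.asarray(raw, dtype=float)
    w = np.asarray(weights, dtype=float)
    # np.outer(w, w)[i, j] is w[i] * w[j], the per-pixel correction factor.
    return raw * np.outer(w, w)

# Toy raw counts and per-bin weights (hypothetical values).
raw = np.array([[8, 2], [2, 4]])
weights = np.array([0.5, 1.0])
balanced = apply_bin_weights(raw, weights)
```

With the real library, this corresponds roughly to reading a matrix via `cooler.Cooler(path).matrix(balance=True)` rather than `balance=False`. If the stored counts have already been corrected externally, applying weights again would correct them twice.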

ay-lab commented 3 years ago

I updated the code. You just need to add the "-nb" option.

ay-lab / selfish

Prior normalization ? #3