labdevgen / 3Dpredictor

3DPredictor - a computational approach to predict spatial interactions of chromatin based on epigenetic data

some questions #66

Open ahmedabbas811 opened 1 year ago

ahmedabbas811 commented 1 year ago

Dear all, why are the predicted and real contact frequencies in the '.scc' file so small (~0.0001)? And how can I get values on the same scale as the original values in the contacts.gz files?

Thanks

polyaB commented 1 year ago

1) Contact values are so small because we divided each contact value by a normalization coefficient that we calculated for each cell type. You can find the coefficients here: https://github.com/labdevgen/3Dpredictor/tree/master/input/normalized_coefficients . The calculation method, with the specific formula, is described in the "Hi-C data processing" part of the Methods section of our paper "Quantitative prediction of enhancer-promoter interactions", Genome Res. 2020 Jan;30(1):72-84, doi: 10.1101/gr.249367.119.

2) You can get these files from a .hic file using the Juicer tools dump command.
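As a rough illustration of point 1, the sketch below multiplies normalized values by a coefficient to bring them back toward the original scale. It assumes the normalization is a plain division by a single per-cell-type coefficient, as described above; the function name `rescale_contacts` and all numbers are hypothetical, and the real coefficient should be taken from the repository folder linked above.

```python
def rescale_contacts(normalized_values, coefficient):
    """Multiply normalized contact values by the cell-type coefficient.

    Assumes the normalization was a plain division by `coefficient`.
    """
    if coefficient <= 0:
        raise ValueError("coefficient must be positive")
    return [value * coefficient for value in normalized_values]


# Hypothetical numbers for illustration only; they are not taken from
# any real .scc file or from the repository's coefficient files.
normalized = [0.0001, 0.00025, 0.0004]
coefficient = 10000.0

restored = rescale_contacts(normalized, coefficient)
print(restored)
```

If the formula in the paper involves more than one division step, the inverse would have to follow that formula instead.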

ahmedabbas811 commented 1 year ago

Thank you,

I have one more question,

Suppose I train on the odd chromosomes and test on the even ones, and I test with chr 8, which has 1013818 loops in the contacts.gz file.

The output ".scc" file, which pairs the original loops with the predicted ones to calculate the SCC value, has only 324531 loops. Is it expected to predict this smaller number of loops? If so, what is the reason? If not, what mistake might I have made in running the program?

Thanks a lot

polyaB commented 1 year ago

This happens because we predict only contacts with a distance between loci of less than 1.5 Mb. The contacts.gz file contains all contacts of the chromosome, including long-range contacts.

Best regards, Polina Belokopytova, PhD student, NSU, Department of Natural Sciences, Genetics

Genomic Mechanisms of Development group ICG SB RAS    
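To check that the difference in loop counts comes from the distance cutoff, one can filter the input contacts by the same 1.5 Mb threshold and count what remains. The sketch below is only an assumption about how such a check could look: the `(start, end, count)` tuple layout, the `within_distance` helper, and the example coordinates are hypothetical and do not reflect the actual contacts.gz format.

```python
MAX_DISTANCE_BP = 1_500_000  # 1.5 Mb cutoff mentioned in the reply above


def within_distance(contacts, max_distance=MAX_DISTANCE_BP):
    """Keep contacts whose two loci are less than max_distance apart."""
    return [
        (start, end, count)
        for start, end, count in contacts
        if abs(end - start) < max_distance
    ]


# Hypothetical contacts as (start, end, count); not real data.
contacts = [
    (100_000, 600_000, 12.0),     # 0.5 Mb apart
    (100_000, 1_600_000, 3.0),    # exactly 1.5 Mb apart
    (2_000_000, 3_400_000, 5.0),  # 1.4 Mb apart
    (2_000_000, 3_500_000, 1.0),  # exactly 1.5 Mb apart
]

kept = within_distance(contacts)
print(len(kept), "of", len(contacts), "contacts kept")
```

The comparison is strict (`<`), matching the "less than 1.5 Mb" wording; whether the tool itself treats the boundary as inclusive is not stated in this thread.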
