yezhengSTAT / FreeHiC

FreeHi-C pipeline for high fidelity Hi-C data simulation.
MIT License

Simulating Hi-C from just a modified reference genome #7

Open sanjitsbatra opened 1 year ago

sanjitsbatra commented 1 year ago

Hi! If I have created a modified reference genome by introducing additional chromosomes that are, for instance, translocations between chr1 and chr12, is it possible to simulate Hi-C data using just the modified reference genome sequence (we of course have access to Hi-C data from a karyotypically normal cell line such as GM12878)? Could you please outline the steps to do so? We were able to simulate Hi-C data from such a modified reference genome with Sim3C, but the maps don't contain the higher-order chromatin patterns seen in your first supplementary figure. Thank you!

yezhengSTAT commented 1 year ago

Hello,

Sorry for the late reply!

This is quite an interesting and unique application of FreeHi-C! Although I haven't tried using FreeHi-C to study translocations or SVs, I think it can be simulated in two ways:

1. Simulate with respect to a normal Hi-C matrix and align the result to your modified reference genome. This mimics the scenario where, for example, a fraction of chr1 is translocated to chr12 but still interacts with the other fragments of chr1.
2. Simulate with respect to a translocated Hi-C matrix (as in structural variation studies, where you will see a lot of trans-interactions across chromosomes) and align to the normal reference genome.
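To make the first option concrete, here is a minimal sketch of what re-expressing normal contacts in modified-reference coordinates amounts to. It is not part of FreeHi-C: the segment table, the `der12` chromosome name, and the `lift` / `lift_contacts` helpers are all hypothetical, and it ignores inversions and reads that span a breakpoint.

```python
# Hypothetical segment table: each entry says that the normal-reference
# interval [start, end) on `src` sits at `dst_start` on `dst` in the
# modified reference. "der12" is an assumed name for the derivative chromosome.
SEGMENTS = [
    ("chr1", 0, 1_000_000, "chr1", 0),                   # part of chr1 left in place
    ("chr1", 1_000_000, 2_000_000, "der12", 3_000_000),  # piece of chr1 moved onto der12
    ("chr12", 0, 3_000_000, "der12", 0),                 # chr12 part of der12
]


def lift(chrom, pos, segments=SEGMENTS):
    """Map one normal-reference position to modified-reference coordinates."""
    for src, start, end, dst, dst_start in segments:
        if chrom == src and start <= pos < end:
            return dst, dst_start + (pos - start)
    return None  # position is not covered by any segment


def lift_contacts(contacts, segments=SEGMENTS):
    """Re-express contacts ((chrom, pos), (chrom, pos), count) on the modified reference."""
    lifted = []
    for (c1, p1), (c2, p2), count in contacts:
        a = lift(c1, p1, segments)
        b = lift(c2, p2, segments)
        if a is not None and b is not None:
            lifted.append((a, b, count))
    return lifted


# A cis contact within chr1 in the normal cell line...
normal = [(("chr1", 500_000), ("chr1", 1_500_000), 7)]
# ...keeps its count but now links chr1 and der12 in the modified reference.
print(lift_contacts(normal))
```

The contact frequency is unchanged; only the coordinates move, which is why the translocated piece still appears to interact with the rest of chr1.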

I guess you will need to design it to fit your study target. What FreeHi-C does is mimic the input Hi-C data and add randomness. If your input Hi-C matrix is normal, then the simulated ones will look like a normal Hi-C matrix. If you want an abnormal Hi-C matrix (with spike-ins, translocations, or other SVs), you will need to provide an "abnormal" Hi-C matrix as input.
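As a toy illustration of why the input determines the output, the sketch below redraws contact counts in proportion to an input matrix. This is a deliberately simplified stand-in, not FreeHi-C's actual procedure, and `resample_contacts` is a hypothetical helper.

```python
import random


def resample_contacts(matrix, seed=None):
    """Redraw the same total number of contacts, proportional to the input counts.

    `matrix` maps (bin_i, bin_j) to an integer contact count.
    """
    rng = random.Random(seed)
    pairs = list(matrix)
    weights = [matrix[p] for p in pairs]
    total = sum(weights)
    draws = rng.choices(pairs, weights=weights, k=total)
    simulated = {p: 0 for p in pairs}
    for p in draws:
        simulated[p] += 1
    return simulated


# Bins 0 and 1 are on one chromosome, bin 2 on another (assumed layout).
normal_input = {(0, 0): 50, (0, 1): 10, (1, 1): 40, (0, 2): 0}
simulated = resample_contacts(normal_input, seed=1)
print(simulated)
```

A bin pair with zero input contacts, such as the interchromosomal pair `(0, 2)` here, stays at zero in the simulated matrix. Resampling adds noise to what is already there; it cannot create a translocation signal that the input lacks.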

Hope it helps.

Thanks, Ye