yezhengSTAT / FreeHiC

FreeHi-C pipeline for high-fidelity Hi-C data simulation.
MIT License

How to simulate hic reads of equal length of 150bp? #11

Open xujialupaoli opened 4 weeks ago

xujialupaoli commented 4 weeks ago

Hello! Thank you for providing such useful software.

My original Hi-C data are paired-end reads that are all exactly 150 bp long, and I would like the simulated data to be uniform 150 bp paired-end reads as well.

However, when I run FreeHi-C, the simulated reads vary in length: some are longer and some are shorter, and the lengths are unevenly distributed. How should I set up the simulation?

This is my FreeHiC_parameters file:

projDir="/home/work//HIC/third/hap1/FreeHiC"
fastqFile="/home/work/HIC/fq_sub/freehictestSub"
ref="/home/work/pecat_hap1_NoGap.fa"
refrag="/home/work/HIC/hap1/HiC.bed"
simuName="demoSimulation"
outDir="${projDir}/results"
summaryFile="${projDir}/summary/${simuName}_FreeHiC.summary"

bwa="bwa"
samtools="samtools"
bedtools="bedtools"

train=1
simulate=1
postProcess=0
coreN=48
mismatchN=3
gapN=1
mismatchP=""
gapP=""
chimericP=""
simuN=50000000
readLen=150
resolution=10000
lowerBound=$((resolution*2))
refragU=500
ligateSite="GATCGATC"

This is my original Hi-C sequencing data: [image]

These are the Hi-C reads I simulated, which have an uneven length distribution: [image]
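
One way to put numbers on the difference between the two files is to tally read lengths directly. The sketch below assumes an uncompressed FASTQ with four lines per record; the function name `read_length_histogram` is illustrative and not part of FreeHi-C.

```python
from collections import Counter
from typing import Iterable


def read_length_histogram(fastq_lines: Iterable[str]) -> Counter:
    """Count reads by sequence length in a 4-line-per-record FASTQ stream."""
    lengths = Counter()
    for i, line in enumerate(fastq_lines):
        # The sequence is the second line of every 4-line record.
        if i % 4 == 1:
            lengths[len(line.strip())] += 1
    return lengths


if __name__ == "__main__":
    # Tiny in-memory example with one 8 bp read and one 5 bp read.
    demo = [
        "@read1", "ACGTACGT", "+", "IIIIIIII",
        "@read2", "ACGTA", "+", "IIIII",
    ]
    for length, count in sorted(read_length_histogram(demo).items()):
        print(length, count)
```

For a real file, an open file handle can be passed in place of the list, since iterating over a text file yields its lines.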

Looking forward to your reply!

yezhengSTAT commented 4 weeks ago

Hello,

Yes, this is expected: the simulated reads can be much shorter or slightly longer than the original read length because of chimeric reads and simulated mutations. If you enable the chimeric read simulation scenario, the program randomly selects some reads and treats them as chimeric reads, i.e., reads that span a restriction cut site and continue into the interacting fragment from a distal region. Before alignment, the sequence from the distal interacting fragment is trimmed off. The simulated chimeric reads therefore behave like real chimeric reads and are shorter than the original read length. As for the longer reads, they are possibly due to insertion mutations added to the simulated sequence.
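
To illustrate why trimming makes chimeric reads shorter, here is a minimal sketch. It is not FreeHi-C's actual code: the function `trim_at_ligation_junction` is hypothetical, and it assumes that trimming keeps the bases up to and including the first half of the ligation site (`GATCGATC`, as in the `ligateSite` parameter above) and discards the rest.

```python
def trim_at_ligation_junction(read: str, ligate_site: str = "GATCGATC") -> str:
    """Illustrative only: cut a read at the first ligation junction.

    Keeps the bases up to and including the first half of the ligation
    site and drops everything after it (an assumption of this sketch).
    Reads without a junction are returned unchanged.
    """
    pos = read.find(ligate_site)
    if pos == -1:
        return read
    return read[: pos + len(ligate_site) // 2]


if __name__ == "__main__":
    read_len = 150
    # Hypothetical chimeric read: 60 bp from one fragment, the junction,
    # then sequence from the distal fragment, cut to the read length.
    chimeric = ("A" * 60 + "GATCGATC" + "C" * 100)[:read_len]
    normal = "A" * read_len
    print(len(trim_at_ligation_junction(chimeric)))  # 64
    print(len(trim_at_ligation_junction(normal)))    # 150
```

In this toy case the read that crosses a junction drops from 150 bp to 64 bp, while the read without a junction keeps its full length, which is the kind of mixed length distribution described above.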

Hope the above makes sense to you.

Thanks, Ye

xujialupaoli commented 4 weeks ago

OK, thank you very much for your patience in explaining!