Open zhangjinpengGithub opened 3 years ago
GangSTR uses coverage in order to estimate the length of large repeat expansions. If the coverage is relatively uniform in a region around the locus of interest, GangSTR should be able to perform genotyping. Otherwise, the results are probably not accurate.
I have discovered through samtools tview that RAD-seq does not have a uniform read distribution in an SSR region. In other words, some reads do not cover the whole SSR region. Do you know how to filter these reads in the BAM file, or how else to solve this kind of problem?
It is fine if not all reads cover the entire repeat region. What matters is whether the coverage is relatively constant across a larger region (5-10 kb) around the locus or whether it fluctuates. If it is constant, you can supply that coverage value to GangSTR with --coverage (no need to filter the BAM file). Otherwise, GangSTR won't be able to report an accurate genotype.
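One rough way to check whether the coverage is constant or fluctuating is to summarize per-base depth over the surrounding 5-10 kb. The sketch below is not part of GangSTR. It assumes you have already exported depth with `samtools depth -a -r <region> <file.bam>` (three tab-separated columns: chromosome, position, depth). The function names, the 500 bp window, and the 0.3 cutoff on the coefficient of variation are illustrative assumptions, not recommended values.

```python
import statistics
import sys


def parse_depth(text):
    """Parse `samtools depth` output (chrom, pos, depth per line) into a list of depths."""
    depths = []
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        depths.append(int(fields[2]))
    return depths


def window_means(depths, window=500):
    """Mean depth in consecutive, non-overlapping windows of `window` bases."""
    return [
        statistics.mean(depths[i:i + window])
        for i in range(0, len(depths), window)
    ]


def coverage_summary(depths, window=500, max_cv=0.3):
    """Return the mean depth and how much the windowed depth fluctuates.

    `max_cv` is an arbitrary, hypothetical cutoff: the region is labeled
    uniform when the standard deviation of the window means divided by
    the overall mean is at or below it.
    """
    if not depths:
        raise ValueError("no depth values supplied")
    overall = statistics.mean(depths)
    means = window_means(depths, window)
    cv = statistics.pstdev(means) / overall if overall > 0 else float("inf")
    return {"mean": overall, "cv": cv, "uniform": cv <= max_cv}


if __name__ == "__main__":
    # Example: samtools depth -a -r chr1:10000-20000 sample.bam | python this_script.py
    summary = coverage_summary(parse_depth(sys.stdin.read()))
    print(summary)
```

If the region comes out roughly uniform, the reported mean is a candidate value to pass with --coverage. With RAD-seq, reads pile up next to restriction sites and drop to zero elsewhere, so the fluctuation is likely to be large.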
My experimental data is not whole genome sequencing but restriction site-associated DNA sequencing (RAD-seq), and I would like to know if I can use your software under such conditions. Thanks again!