Closed Leehyeonjin93 closed 4 years ago
According to user feedback, researchers apply different filtering standards. Type 0 does indicate lower confidence based on the supporting reads, although I would not suggest filtering those calls out directly. As in most tools, chr and position give the breakpoint. The START and END values you highlighted below describe the inserted TE sequence, not human coordinates. In the example shown, the predicted HERV insertion position is chr1:5617379.
I would suggest filtering out low-quality TE genotypes (for example, keeping only calls with GQ>20, or another cutoff); checking whether a called TE insertion is located within a reference TE of the same type; filtering by the total depth of supporting reads (DPI) relative to your sequencing depth; and filtering out TEs with extremely long TSDs, which are rare but may still be real, unless you want to analyze that group.
Xun
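As a rough illustration of the GQ and DPI filters suggested above, here is a minimal Python sketch that works on the FORMAT and sample columns from the example record. The function names (`parse_sample`, `insertion_site`, `keep_call`) and the `min_dpi=10` default are assumptions made for illustration and are not part of ERVcaller; the GQ>20 cutoff is taken from the suggestion above, and the DPI cutoff should be chosen according to your own sequencing depth.

```python
def parse_sample(format_field, sample_field):
    """Pair FORMAT keys (e.g. GT:GQ:GL:DPN:DPI) with one sample's values."""
    return dict(zip(format_field.split(":"), sample_field.split(":")))


def insertion_site(chrom, pos):
    """Report the breakpoint as CHROM:POS, taken directly from the VCF record."""
    return f"{chrom}:{pos}"


def keep_call(format_field, sample_field, min_gq=20.0, min_dpi=10.0):
    """Return True if GQ > min_gq and DPI >= min_dpi.

    Both cutoffs are hypothetical defaults; adjust them to your data.
    """
    sample = parse_sample(format_field, sample_field)
    try:
        gq = float(sample["GQ"])
        dpi = float(sample["DPI"])
    except (KeyError, ValueError):
        # Missing or non-numeric values (e.g. ".") are treated as failing.
        return False
    return gq > min_gq and dpi >= min_dpi


if __name__ == "__main__":
    # FORMAT and sample values copied from the example record in this thread.
    fmt, sample = "GT:GQ:GL:DPN:DPI", "1/1:40:0,0,1:0:67"
    print(insertion_site("chr1", 5617379), keep_call(fmt, sample))
```

With the example record (GQ=40, DPI=67), the call passes both cutoffs. The other suggested checks, overlap with a reference TE of the same type and TSD length, need additional inputs and are not covered by this sketch.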
Thanks. I have another question: I didn't find GATK steps such as MarkDuplicates, realignment, or recalibration in your paper. Why didn't you use them?
Thanks for your suggestions.
We want to make ERVcaller easy to install and use. We did consider including QC steps, but they can be handled beforehand by many other tools. We also assumed most users would already perform read QC separately, such as removing redundant reads, low-quality reads, adaptor sequences, etc. Thanks again; we may reconsider including those steps in our next version.
ERVcaller also performs a realignment step, which may differ slightly from GATK's.
We did not include a recalibration step because, as far as I know, it is meant for the quality scores used in calling SNPs and indels, not for TE insertions.
Thanks, Xun
Thanks for the perfect answer.
Thank you for developing a great tool. I have some final VCF files, but I didn't find any additional filters. The status of a detected TE ranges from 0 to 5; is type 0 OK? Are the chr and position in the VCF the breakpoint? I don't understand where the breakpoint is, or what START and END mean. For example:
CHROM POS ID REF ALT QUAL FILTER INFO FORMAT TE_seq
chr1 5617379 . T1,7831,7831,+,4;CR=64;SR=3;GTF=YES;GR=1.000 GT:GQ:GL:DPN:DPI 1/1:40:0,0,1:0:67
Where is HERVK's insertion position?