Hi!
I have run the CTAT pipeline on my samples and have a few questions about the filtering and annotation steps:
With boosting method=none (hard filtering), why are indels not considered? Also, why are only variants annotated on chromosomes kept, and why are scaffolds excluded?
Why is the gatk VariantRecalibrator step not included in the pipeline? Is it because the truth data needed to train VQSR and CNNScoreVariants are not yet available for RNA-seq?
I also ran the pipeline using as input the vcf file and the bam file obtained from the gatk ApplyBQSR step, with overlapping reads clipped. During the "PASS read annotations" step, does the script annotate_PASS_reads.py filter out all variants that lie less than 6 bases from the ends of the reads?
Thank you for your help!
Concetta