sr320 / ceabigr

Workshop on genomic data integration with an emphasis on epigenetic data (FHL 2022)

QC / next step on RNAseq SNPs #65

Closed: sr320 closed 9 months ago

sr320 commented 2 years ago

I have run the following code https://github.com/sr320/ceabigr/blob/main/code/11-RNAseq-snps.Rmd and think I am ready to move forward.

In short, I hard-filtered the variants:

cd /home/shared/8TB_HDD_01/sr320/github/ceabigr/output/11-RNAseq-snps
echo "Hard filtering variants"
/home/shared/gatk-4.2.5.0/gatk VariantFiltration \
-R /home/shared/8TB_HDD_01/sr320/github/ceabigr/data/Cvirginica_v300.fa \
-V Cv-rnaseq_genotypes.vcf.gz \
-O Cv-rnaseq_genotypes-filtered.vcf.gz \
--filter-name "FS" \
--filter "FS > 60.0" \
--filter-name "QD" \
--filter "QD < 2.0" \
--filter-name "QUAL30" \
--filter "QUAL < 30.0" \
--filter-name "SOR3" \
--filter "SOR > 3.0" \
--filter-name "DP15" \
--filter "DP < 15" \
--filter-name "DP150" \
--filter "DP > 150" \
--filter-name "AF30" \
--filter "AF < 0.30" >> "Genotype_filter_stout.txt" 2>&1

produced: https://gannet.fish.washington.edu/seashell/bu-github/ceabigr/output/11-RNAseq-snps/Cv-rnaseq_genotypes-filtered.vcf.gz
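
As a quick QC on the hard-filtering step, it may help to tally how many records ended up under each FILTER tag before selecting the passing SNPs. The sketch below is only an illustration: `count_filters` is a hypothetical helper name (not part of GATK), and it assumes a standard tab-delimited VCF with FILTER in column 7. Records that fail several filters are labeled by VariantFiltration with the names joined by `;` (for example `FS;QD`), so each combination shows up as its own category. Note also that, as far as I understand, `--filter` expressions are evaluated on site-level INFO annotations, so `DP` here would be total depth across samples rather than per-sample depth; worth confirming against the GATK documentation.

```shell
# Tally the FILTER column (column 7) of an uncompressed VCF read from stdin.
# count_filters is a hypothetical helper, not a GATK tool.
count_filters() {
  awk -F'\t' '
    !/^#/ { n[$7]++ }                            # skip header lines, count each FILTER value
    END   { for (f in n) print n[f] "\t" f }     # print: count <TAB> filter tag
  ' | sort -k1,1nr -k2,2                         # largest category first
}

# Usage (assumes the filtered VCF from the step above is in the current directory):
# gzip -dc Cv-rnaseq_genotypes-filtered.vcf.gz | count_filters
```

If one tag (for example `DP15` or `AF30`) accounts for most of the removed records, that threshold is probably the one to revisit first.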

# Select only SNPs that pass filtering
cd /home/shared/8TB_HDD_01/sr320/github/ceabigr/output/11-RNAseq-snps

echo "Selecting SNPs that pass filtering"
/home/shared/gatk-4.2.5.0/gatk SelectVariants \
-R /home/shared/8TB_HDD_01/sr320/github/ceabigr/data/Cvirginica_v300.fa \
-V Cv-rnaseq_genotypes-filtered.vcf.gz \
--exclude-filtered TRUE \
--select-type-to-include SNP \
-O Cv-rnaseq_genotypes-filtered.true.vcf.gz \
 >> "SelectVariants_stout.txt" 2>&1

echo "complete!"

produced: https://gannet.fish.washington.edu/seashell/bu-github/ceabigr/output/11-RNAseq-snps/Cv-rnaseq_genotypes-filtered.true.vcf.gz
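
Before moving forward, a simple sanity check on the final file could confirm that it only holds passing SNPs. This is a minimal sketch under stated assumptions: `check_snp_vcf` is a hypothetical helper, and "looks like a SNP" is defined here narrowly as a single-base REF with every ALT allele a single A/C/G/T base, which may be stricter than the definition GATK uses.

```shell
# Summarize an uncompressed VCF read from stdin.
# check_snp_vcf is a hypothetical helper, not a GATK tool.
check_snp_vcf() {
  awk -F'\t' '
    /^#/ { next }                                   # skip header lines
    {
      total++
      if ($7 != "PASS" && $7 != ".") failed++       # record carries a failing filter tag
      snp = ($4 ~ /^[ACGT]$/)                       # REF must be a single base
      n = split($5, alt, ",")                       # ALT may list several alleles
      for (i = 1; i <= n; i++)
        if (alt[i] !~ /^[ACGT]$/) snp = 0           # every ALT allele must be a single base
      if (!snp) nonsnp++
    }
    END { printf "records=%d not_pass=%d non_snp=%d\n", total, failed, nonsnp }
  '
}

# Usage (assumes the final VCF from the step above is in the current directory):
# gzip -dc Cv-rnaseq_genotypes-filtered.true.vcf.gz | check_snp_vcf
```

For the output of the SelectVariants step, `not_pass` would be expected to be 0; a nonzero `non_snp` would only mean some records fall outside the narrow definition used in this sketch and deserve a closer look.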