yuanlizhanshi closed 3 years ago
Hi, if I understand your question correctly, you have sequenced the parental lines (neither of which is the reference genome genotype) as well as the two F2 bulks. You now want to filter out the SNPs that the parental lines carry relative to the reference genome.
I recommend the following pipeline:
1) Align one or both parents to the reference genome.
2) Call variants for that parent vs. the reference genome.
3) Extract only the SNPs and exclude any INDELs (the indels will shift your sequence positions). Make sure to keep only the highest-quality, highest-confidence SNPs.
4) Use the FastaAlternateReferenceMaker tool from GATK, or another similar tool, to apply the SNPs to the reference genome and define an alternate FASTA file.
5) Align and call SNPs from your F2 bulks vs. the new FASTA you've just created.
6) Proceed with the analysis as usual.
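To make step 4 concrete, here is a minimal Python sketch of what "applying the SNPs to the reference" means. It is only an illustration, not GATK's implementation: the function name `apply_snps`, the toy sequence, and the SNP positions are all made up for the example. It also shows why INDELs are left out in step 3: single-base substitutions keep every coordinate in place, so positions in the new FASTA still match the original reference.

```python
def apply_snps(reference: str, snps: dict[int, tuple[str, str]]) -> str:
    """Return a copy of `reference` with each SNP's ALT base substituted.

    `snps` maps a 1-based position to (ref_base, alt_base), as in a VCF.
    Hypothetical helper for illustration only.
    """
    seq = list(reference)
    for pos, (ref_base, alt_base) in snps.items():
        # Only single-base substitutions are allowed, so the length never changes.
        if len(ref_base) != 1 or len(alt_base) != 1:
            raise ValueError(f"not a SNP at position {pos}")
        # Guard against applying a call to the wrong coordinate.
        if seq[pos - 1] != ref_base:
            raise ValueError(f"REF mismatch at position {pos}")
        seq[pos - 1] = alt_base
    return "".join(seq)


# Toy data (assumed): a 10 bp "reference" and two high-confidence parental SNPs.
reference = "ACGTACGTAC"
parent_snps = {2: ("C", "T"), 7: ("G", "A")}

alternate = apply_snps(reference, parent_snps)
print(alternate)
```

Because the alternate sequence has the same length as the reference, SNP positions called from the F2 bulks against it can still be compared directly with the original coordinates.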
Hope this helps, Ben
Got it, thank you very much.
I know QTLseqr is based on BSA-seq and can calculate the SNP-index for the two F2 samples, but usually we have four or more samples: P1, P2, and two F2 bulks. How can I filter out the SNPs that P1, P2, and the F2 bulks all contain, and then calculate the SNP-index? I would appreciate it if you could consider my question.
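For reference, the quantities in this question can be sketched in a few lines of plain Python. This is not QTLseqr's API. The SNP-index is the fraction of reads at a site that carry the alternate allele, and the delta SNP-index is the difference between the two bulks. The parental filter shown here (keep only sites where both parents are homozygous and differ from each other) is one common choice and is an assumption of this sketch, as are the function names, the `0/0`-style genotype strings, and the toy read depths.

```python
def snp_index(ref_depth: int, alt_depth: int) -> float:
    """Fraction of reads at a site that carry the alternate allele."""
    total = ref_depth + alt_depth
    if total == 0:
        raise ValueError("no reads cover this site")
    return alt_depth / total


def parents_informative(p1_gt: str, p2_gt: str) -> bool:
    """Assumed filter: both parents homozygous, and for different alleles."""
    homozygous = {"0/0", "1/1"}
    return p1_gt in homozygous and p2_gt in homozygous and p1_gt != p2_gt


# Toy sites (made up): parental genotypes plus (ref_depth, alt_depth) per bulk.
sites = [
    {"pos": 101, "p1": "0/0", "p2": "1/1", "high": (2, 18), "low": (12, 8)},
    {"pos": 250, "p1": "1/1", "p2": "1/1", "high": (0, 20), "low": (0, 20)},
    {"pos": 377, "p1": "0/1", "p2": "1/1", "high": (9, 11), "low": (10, 10)},
]

# Drop sites shared by both parents (pos 250) or heterozygous in a parent (pos 377).
kept = [s for s in sites if parents_informative(s["p1"], s["p2"])]

# Delta SNP-index = SNP-index of the high bulk minus that of the low bulk.
delta = {s["pos"]: snp_index(*s["high"]) - snp_index(*s["low"]) for s in kept}
print(delta)
```

A site where P1, P2, and both bulks all carry the same alternate allele (position 250 here) is removed by the filter, since it reflects a difference from the reference rather than a difference between the parents.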