Open Mailinnia opened 3 years ago
Because the coverage is very high, the strand-bias filter removes SNPs that have only a very slight strand bias. One solution is to use a lower p-value threshold (e.g. 0.0001), which should remove only SNPs with a strong strand bias. We can also address this in longshot by adding an additional criterion for filtering SNPs on strand bias that takes the magnitude of the bias into account.
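To illustrate why high coverage makes a slight imbalance statistically significant, here is a minimal sketch in plain Python of a two-sided Fisher's exact test on a 2x2 allele-by-strand table. This is not longshot's code; it only assumes that the cutoff is compared against a two-tailed Fisher p-value. The alternate-allele counts (609+/739-) are the ones quoted in the report, while the reference-allele counts and the low-coverage table are hypothetical values chosen for illustration.

```python
from math import comb

def fisher_exact_two_sided(ref_fwd, ref_rev, alt_fwd, alt_rev):
    """Two-sided Fisher's exact p-value for a 2x2 allele-by-strand table."""
    ref_total = ref_fwd + ref_rev
    alt_total = alt_fwd + alt_rev
    fwd_total = ref_fwd + alt_fwd
    denom = comb(ref_total + alt_total, fwd_total)

    def pmf(x):
        # Probability of x reference reads on the forward strand, margins fixed.
        return comb(ref_total, x) * comb(alt_total, fwd_total - x) / denom

    lo = max(0, fwd_total - alt_total)
    hi = min(ref_total, fwd_total)
    probs = [pmf(x) for x in range(lo, hi + 1)]
    p_obs = pmf(ref_fwd)
    # Sum every table that is at most as likely as the observed one.
    return min(1.0, sum(p for p in probs if p <= p_obs * (1 + 1e-7)))

# Alt counts from the report; ref counts (700+/650-) are hypothetical.
p_high = fisher_exact_two_sided(700, 650, 609, 739)
# Roughly the same proportions at about 1/20 of the coverage (hypothetical).
p_low = fisher_exact_two_sided(35, 33, 30, 37)

print(f"high coverage: p = {p_high:.2e}")
print(f"low coverage:  p = {p_low:.2e}")
```

With these illustrative counts, the same proportional imbalance falls below a 0.01 cutoff at high coverage but not at low coverage, which is why a p-value alone cannot distinguish slight bias from strong bias when depth is very high.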
How do I filter based on the magnitude of the bias?
This is not yet implemented in longshot, and the output VCF file does not contain the information needed to calculate the magnitude of the bias.
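As a rough idea of what a magnitude-based criterion could look like, here is a hypothetical sketch in Python. It is not a longshot feature: the function name, the scaling, and the 0.5 threshold are all assumptions, and the per-strand counts would have to be obtained separately because the VCF does not carry them. The two inputs are the alternate-allele counts quoted in the original post (609+/739- and 3+/827-).

```python
def strand_bias_magnitude(alt_fwd, alt_rev):
    """Distance of the forward-strand fraction of alt reads from 0.5, scaled to [0, 1].

    0.0 means the alt reads are split evenly between strands;
    1.0 means all alt reads come from a single strand.
    """
    total = alt_fwd + alt_rev
    if total == 0:
        return 0.0
    return abs(alt_fwd / total - 0.5) * 2

# Hypothetical threshold: only flag sites whose alt reads are heavily one-sided.
MAGNITUDE_CUTOFF = 0.5

slight = strand_bias_magnitude(609, 739)  # SNP the reporter expects to be called
strong = strand_bias_magnitude(3, 827)    # SNP with clear strand bias

for name, value in (("609+/739-", slight), ("3+/827-", strong)):
    verdict = "filter" if value > MAGNITUDE_CUTOFF else "keep"
    print(f"{name}: magnitude = {value:.3f} -> {verdict}")
```

This simple measure looks only at the alternate allele and ignores how the reference reads are distributed across strands, so a real criterion would probably need to compare the two; it is meant only to show that the two sites in the report differ greatly in magnitude even though both fail the p-value test.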
Hi,
I am having an issue with longshot not calling SNPs that I know to be true. They seem to be filtered out based on strand bias (FILTER sb).
I'm using the following command: longshot -r CYP2D6:2600-8700 --bam NA23348.sorted.bam --ref .CYP2D6.NG008376.4.fasta --out PCR_0.01.vcf --no_haps --strand_bias_pvalue_cutoff 0.01 --min_alt_frac 0.2 -d longshot0.01_debug
If I set --strand_bias_pvalue_cutoff 0.0, it of course finds the SNPs, but it also reports SNPs where there is clear strand bias.
This SNP should be called but isn't at a cutoff of 0.01. I don't understand why longshot reports strand bias when the alternate G is found on 609 + strand and 739 - strand reads. That doesn't seem biased to me. I am losing several true SNPs to this type of 'strand bias'.
I understand this one, where there is clear strand bias (alternate C on 3 + strand and 827 - strand reads).
What can I do to mitigate this issue?