Open jaredo opened 9 years ago
Yes, joint calling is done in that way. See also the example in README:
fermi.kit/htsbox pileup -cuf ref.fa pre1.srt.bam pre2.srt.bam > out.raw.vcf
fermi.kit/k8 fermi.kit/hapdip.js vcfsum -f out.raw.vcf > out.flt.vcf
The second command line filters the calls.
Please note that pileup is not true joint calling in that it doesn't use cross-sample information. It essentially combines single-sample VCFs. Also note that fermikit is not designed for normal-tumor pairs. Some of its components may help, but the standard workflow would not work well.
Thanks, I'm using it for denovo calling in trios.
I am afraid that wouldn't work well, either. The problem with fermikit is that when it misses a variant, it misses completely. FNs in parents will lead to spurious de novo calls. In comparison, when a typical caller (e.g. gatk/samtools) misses a variant, you can usually see a few reads having the correct variants. This helps to reduce false de novo calls.
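To make the false-negative problem concrete, here is a minimal sketch in Python. It is not part of fermikit; the function name and the boolean "carries the ALT allele" encoding are assumptions made purely for illustration. It applies the naive de novo rule (present in child, absent from both parents) and shows how a single missed call in a parent turns an inherited variant into a spurious de novo candidate.

```python
def is_candidate_de_novo(child_has_alt, mother_has_alt, father_has_alt):
    """Naive rule: the child carries the ALT allele and neither parent does.

    Hypothetical helper for illustration; each argument is True when the
    caller reported the ALT allele in that sample.
    """
    return child_has_alt and not mother_has_alt and not father_has_alt


# Ground truth: the variant is inherited from the mother, so it is not de novo.
truth = {"child": True, "mother": True, "father": False}

# Same site, but the caller missed the variant in the mother (a false negative).
called = dict(truth, mother=False)

inherited = is_candidate_de_novo(truth["child"], truth["mother"], truth["father"])
spurious = is_candidate_de_novo(called["child"], called["mother"], called["father"])
print(inherited, spurious)
```

With a read-based caller you could still inspect the parental pileup for a few supporting reads before trusting such a call; a caller that misses the variant completely gives you nothing to check at that step.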
Probably the right way to perform normal-tumor and trio calling is to assemble the tumor/child and then map it against error corrected reads of normal/parents with fermi2 match -p. I have not explored this approach yet.
PS: alternatively, you can use both fermikit and a typical de novo calling pipeline at the same time. You may require a de novo variant called by both approaches. Fermikit uses a very different method for variant calling. Combining distinct approaches usually helps to reduce false positives.
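The "called by both approaches" requirement can be sketched as a set intersection over VCF records. This is a minimal Python sketch, not an existing fermikit or htsbox feature; the function names are hypothetical, and it assumes both inputs are uncompressed VCF text whose variants use the same representation (for example, both normalized with a tool such as bcftools norm), since exact matching on position and alleles will otherwise miss equivalent indels.

```python
def variant_keys(vcf_lines):
    """Collect (chrom, pos, ref, alt) keys from an iterable of VCF text lines.

    Header lines (starting with '#') and blank lines are skipped.
    Multi-allelic records are split into one key per ALT allele.
    """
    keys = set()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
        for allele in alt.split(","):
            keys.add((chrom, int(pos), ref, allele))
    return keys


def consensus_calls(calls_a, calls_b):
    """Keep only the variants reported by both call sets."""
    return variant_keys(calls_a) & variant_keys(calls_b)


# Toy de novo candidate lists from two pipelines (made-up records).
pipeline_a = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t1000\t.\tA\tG\t50\t.\t.",
    "1\t2000\t.\tC\tT\t40\t.\t.",
]
pipeline_b = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t1000\t.\tA\tG\t99\tPASS\t.",
    "2\t3000\t.\tG\tA,C\t80\tPASS\t.",
]

shared = consensus_calls(pipeline_a, pipeline_b)
print(sorted(shared))
```

To run it on real files, pass open file handles instead of the lists, e.g. `consensus_calls(open("a.vcf"), open("b.vcf"))`, where the file names are placeholders.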
I see, I had not considered the FN issue and just thought the low FDR would be helpful. I will have a play around with your suggestions.
Hi Heng,
Great tool. Do you have any advice for joint calling of multiple samples?
I obtained a reasonable looking call set by first running your run-calling script on each sample individually. Then I ran the following on the bams produced:
fermi.kit/htsbox pileup -cuf hs37d5.fa *.bam | bcftools view -Oz output.vcf.gz
Is this a sensible approach? Obviously filtering still needs to be performed.
cheers,
Jared