HKU-BAL / Clair3

Clair3 - Symphonizing pileup and full-alignment for high-performance long-read variant calling
Missing variants in one contig #216

Closed mproberts99 closed 1 year ago

mproberts99 commented 1 year ago

Hi, I am using Clair3 for variant calling on targeted amplicon sequencing data. Because we are using the R10 flow cell and the model provided by ONT was trained at 60x depth, we subsample the BAM to 60x before variant calling. During pileup calling, Clair3 does not identify any variant candidates in one of the contigs (chr6), even though variants are clearly visible in the BAM file when viewed in IGV. This has happened in at least two samples. I have attached the log and can send the VCF/BAM files as needed for troubleshooting.

run_clair3.log

aquaskyline commented 1 year ago

It looks like no candidate was found in chr6. If you want us to take a deeper look, please send us a mini BAM containing only the chr6 reads.

oneillkza commented 1 year ago

We've also noticed that Clair3 tends to silently skip a contig if it encounters any error while processing it. For example, we had a couple of samples in a large run that were affected by network instability while they were running. About half of their contigs failed because the subprocesses working on them couldn't read the input.
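One way to catch skipped contigs after a run is to compare the contigs that appear in the output VCF against the contigs you expected to be called. The following is a minimal sketch, not part of Clair3: the function name `contigs_without_calls` and the toy VCF records are hypothetical, and it relies only on the fact that the first tab-separated column of a VCF record is the contig name. A contig with no records is not necessarily a failure (it may genuinely have no variants), so the result is only a list of contigs worth checking against the log.

```python
def contigs_without_calls(vcf_lines, expected_contigs):
    """Return the expected contigs that have no record in the VCF text lines."""
    seen = set()
    for line in vcf_lines:
        # Skip blank lines and VCF header/meta lines, which start with '#'.
        if not line.strip() or line.startswith("#"):
            continue
        # Column 1 (CHROM) of a VCF record is the contig name.
        seen.add(line.split("\t", 1)[0])
    return [c for c in expected_contigs if c not in seen]


# Toy records for illustration only (not taken from the issue's data).
example_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t20\tPASS\t.",
    "chr2\t250\t.\tC\tT\t18\tPASS\t.",
]
missing = contigs_without_calls(example_vcf, ["chr1", "chr2", "chr6"])
print(missing)
```

For a real run, the lines could come from an opened (decompressed) VCF file, and the expected contigs from the reference index or a BED file of targets.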

zhengzhenxian commented 1 year ago

@mproberts99,

Thank you for providing the log and data. We noticed that all the reads in your chr6 BAM file have a mapping quality of 1. Clair3 filters out such low-MQ reads (MQ < 5), so no candidates were output for chr6. The variants seen in these low-MQ reads are likely false positives caused by alignment artifacts.
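To check for this situation before running the caller, you can summarize mapping quality per contig. The sketch below is an illustration, not Clair3 code: it parses SAM-format text lines (for example, the output of `samtools view`), where column 3 is the contig name and column 5 is MAPQ. The function names and the toy reads are hypothetical, and the default threshold of 5 is taken from the comment above.

```python
from collections import defaultdict

MIN_MQ = 5  # assumption: threshold quoted in the thread (reads with MQ < 5 are filtered)


def mapq_summary(sam_lines, min_mq=MIN_MQ):
    """Return {contig: (total_reads, reads_below_min_mq)} from SAM text lines."""
    counts = defaultdict(lambda: [0, 0])
    for line in sam_lines:
        # Skip blank lines and SAM header lines, which start with '@'.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        contig, mapq = fields[2], int(fields[4])
        if contig == "*":  # unmapped read, no contig assigned
            continue
        counts[contig][0] += 1
        if mapq < min_mq:
            counts[contig][1] += 1
    return {contig: (total, low) for contig, (total, low) in counts.items()}


def contigs_fully_filtered(summary):
    """Contigs where every read falls below the MQ threshold."""
    return sorted(c for c, (total, low) in summary.items() if low == total)


# Toy reads for illustration only (not taken from the issue's data).
example_sam = [
    "@SQ\tSN:chr6\tLN:1000",
    "r1\t0\tchr6\t100\t1\t10M\t*\t0\t0\tACGTACGTAC\t*",
    "r2\t0\tchr6\t120\t1\t10M\t*\t0\t0\tACGTACGTAC\t*",
    "r3\t0\tchr1\t50\t60\t10M\t*\t0\t0\tACGTACGTAC\t*",
]
summary = mapq_summary(example_sam)
print(summary)
print(contigs_fully_filtered(summary))
```

On real data, the same function could read from standard input, e.g. by piping `samtools view sample.bam` into a script that calls `mapq_summary(sys.stdin)`. A contig reported by `contigs_fully_filtered` would be expected to yield no candidates under this threshold.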