ncbi / ACTIVTRACEvariants


Masking of Homopolymer region changes Nextclade Variant assignment #3

Open Rohit-Satyam opened 2 years ago

Rohit-Satyam commented 2 years ago

Hi Authors @edwelker @kmt @rkhaja @vadimzalunin @nawrockie

While reading your paper, I tested your suggestions for variant filtering. They do improve assembly coverage slightly. However, when I exclude variants in homopolymer regions as you suggested, many variants get filtered out and a 22B Omicron assembly is reassigned to 19A. This is not desirable.

Note: These assemblies were generated with the wf-artic pipeline from ONT MinION data. In each sample, 98-99% of variants passed the variant filtering criteria of AF>=0.5 ADP>=50 AF>100. Therefore, filtering out such variants just because they lie in the same region as a homopolymer might give erroneous results, I think.
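To make the two filtering steps concrete, here is a minimal Python sketch of how the threshold filter and the homopolymer mask could interact. It is an illustration only, not the script attached to this issue or the paper's code. The `Variant` class, the function names, and the example coordinates are all hypothetical. The sketch assumes the homopolymer file is whitespace-separated BED (0-based, half-open intervals) and that variant positions are 1-based as in VCF. Only the AF>=0.5 and ADP>=50 thresholds are applied; the third criterion is left out because its meaning is not clear from the post.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    chrom: str
    pos: int    # 1-based position, as in VCF
    af: float   # allele frequency
    adp: int    # average depth


def parse_bed(lines):
    """Parse BED lines into (chrom, start, end) tuples (0-based, half-open)."""
    regions = []
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        regions.append((chrom, int(start), int(end)))
    return regions


def in_regions(variant, regions):
    """True if the variant's 1-based position falls inside any BED interval."""
    zero_based = variant.pos - 1
    return any(c == variant.chrom and s <= zero_based < e for c, s, e in regions)


def passes_thresholds(variant, min_af=0.5, min_adp=50):
    """Apply the AF and ADP thresholds quoted in the post."""
    return variant.af >= min_af and variant.adp >= min_adp


def split_variants(variants, regions):
    """Return (kept, masked): masked variants pass the thresholds but
    are dropped only because they overlap a homopolymer region."""
    kept, masked = [], []
    for v in variants:
        if not passes_thresholds(v):
            continue
        (masked if in_regions(v, regions) else kept).append(v)
    return kept, masked


# Made-up example data, for illustration only.
example_bed = ["ref\t200\t210"]
example_variants = [
    Variant("ref", 100, 0.98, 200),  # passes thresholds, outside the mask
    Variant("ref", 205, 0.95, 150),  # passes thresholds, inside the mask
    Variant("ref", 300, 0.30, 400),  # fails the AF threshold
]
kept, masked = split_variants(example_variants, parse_bed(example_bed))
print("kept:", [v.pos for v in kept])
print("masked:", [v.pos for v in masked])
```

Listing the `masked` variants separately makes it possible to check whether any well-supported, clade-defining substitutions are being dropped purely because of their location, which is the concern raised here.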

Before filtering the variants in homopolymer regions: (image)

After filtering the variants in homopolymer regions: (image)

Below are the homopolymer BED file (generated using your code) and the script that I am using:

homopolymer.csv reviseAssembly.zip

Rohit-Satyam commented 2 years ago

Hi, do you have any insights?

Rohit-Satyam commented 1 year ago

Hi @edwelker @kmt @rkhaja @vadimzalunin, can you answer my query? I also observed that using just AF>=0.5 ADP>=50 AF>100 filters out some reversions, as per Nextclade, and changes the clade assignment in some samples.

Rohit-Satyam commented 1 year ago

Hi @edwelker @kmt @rkhaja @vadimzalunin

Can anyone resolve this query?