maickrau / GraphAligner

MIT License
256 stars 30 forks

Alignment Termination/Soft Clipping #28

Closed: schorlton closed this issue 3 years ago

schorlton commented 3 years ago

Thanks for your amazing tool! I'm struggling to understand and adjust the soft clipping behavior of GraphAligner - is this possible?

I'm including an example: Example.tar.gz

This is amplicon sequencing of a virus aligned with minimap2 and GraphAligner to the same reference.

minimap2 -ax map-ont CMV.reference.fasta read.fastq | samtools sort -o minimap2.bam

vg construct -r CMV.reference.fasta -m 1000000000 > CMV.reference.vg
GraphAligner -g CMV.reference.vg -f read.fastq -x vg -a GraphAligner.gam
vg surject -b GraphAligner.gam -x CMV.reference.vg | samtools sort -o GraphAligner.bam

Minimap2 correctly soft-clips the alignment at the end of the amplicon primer, whereas GraphAligner extends the alignment to the end of the read, where the sequence is just noise. The same thing happens for all of the reads at this amplicon. I've surjected the GAM to BAM for easier visualization: image

Thanks for your help!

maickrau commented 3 years ago

Hi, the current version in Bioconda does not have a parameter for this, but the version in the master branch has the option "--precise-clipping <identity>" for this case. The option clips alignment ends whose identity is lower than the given value, so for example "--precise-clipping 0.8" will clip alignment ends whose identity is less than 80%.
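
To illustrate the idea of clipping a low-identity alignment end, here is a minimal sketch in Python. It is not GraphAligner's actual implementation: the function name `clip_low_identity_end`, the boolean per-column representation, and the scoring rule (each match adds `1 - min_identity`, each mismatch or indel subtracts `min_identity`, and the alignment is cut at the best-scoring prefix) are all assumptions made for illustration.

```python
def clip_low_identity_end(columns, min_identity):
    """Return how many alignment columns to keep, counted from the start.

    columns: sequence of booleans, True for a matching column and
             False for a mismatch or indel column (hypothetical encoding).
    min_identity: identity threshold between 0 and 1, e.g. 0.8.

    Illustrative heuristic only, not GraphAligner's algorithm: keep the
    prefix with the highest cumulative score, so that the removed tail
    has an identity no higher than min_identity.
    """
    best_score = 0.0
    best_len = 0
    score = 0.0
    for i, is_match in enumerate(columns, start=1):
        # A match raises the score, anything else lowers it.
        score += (1.0 if is_match else 0.0) - min_identity
        if score > best_score:
            best_score = score
            best_len = i
    return best_len


# 20 clean amplicon columns followed by a noisy tail (2 matches in 10 columns).
clean = [True] * 20
noisy = [False, False, True, False, False, False, True, False, False, False]

keep = clip_low_identity_end(clean + noisy, 0.8)
print(keep)  # prints 20: the noisy tail would be soft-clipped
```

With a threshold of 0.8, the ten noisy columns (20% identity) are dropped and only the 20 clean columns are kept, which is the behaviour the option description suggests for reads that run past the amplicon primer.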

schorlton commented 3 years ago

Awesome, thanks! Any plans to tag a new release? I notice that you've made some great changes since your last official version.

maickrau commented 3 years ago

The new clipping behaviour is now on by default in 1.0.13.