chanzuckerberg / idseq-workflows

Portable WDL workflows for IDseq production pipelines
https://idseq.net/
MIT License

possible issue with INDEL calling by samtools #97

Open danrlu opened 3 years ago

danrlu commented 3 years ago

We recently discovered a bug in our code: by default, we were leaving out most of the INDELs in the VCF.

I see IDseq is not enforcing the samtools version, so I'm not sure whether the same problem applies to the version used here. FYR

kislyuk commented 3 years ago

Hi @danrlu, thanks for letting us know. IDseq does indeed set a specific samtools version, by installing the version that ships with Ubuntu 20.04 LTS in the Dockerfile for the workflow (https://github.com/chanzuckerberg/idseq-workflows/blob/main/consensus-genome/Dockerfile). The specific version can be found at https://packages.ubuntu.com/focal/samtools - it's version 1.10.

It's not clear to me from reading https://github.com/czbiohub/sc2-illumina-pipeline/issues/80 and https://github.com/czbiohub/sc2-illumina-pipeline/pull/81 which samtools versions have the issue you refer to, and whether we need to add a custom -L setting when calling samtools mpileup. Can you clarify? cc @katrinakalantar
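For anyone who wants to confirm which samtools version a given image actually ships, here is a minimal Python sketch. It assumes the first line of `samtools --version` output has the form `samtools <version>`; the function names and the default expected value of `1.10` (taken from the Ubuntu 20.04 package mentioned above) are illustrative, not part of the IDseq code.

```python
import re


def parse_samtools_version(version_output: str) -> str:
    """Extract the version from the first line of `samtools --version` output.

    Assumes the first non-empty line looks like "samtools 1.10".
    """
    lines = version_output.strip().splitlines()
    first_line = lines[0] if lines else ""
    match = re.match(r"samtools\s+(\d+(?:\.\d+)*)", first_line)
    if match is None:
        raise ValueError(f"unrecognized samtools version output: {first_line!r}")
    return match.group(1)


def matches_pin(version_output: str, expected: str = "1.10") -> bool:
    """Return True if the reported version equals the expected (hypothetical) pin."""
    return parse_samtools_version(version_output) == expected
```

In practice the text could come from `subprocess.run(["samtools", "--version"], capture_output=True, text=True).stdout` inside the container.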

danrlu commented 3 years ago

Oh, I didn't know the version could be specified this way, very good to learn! Sorry for leaving out the critical piece of info: our samtools is 1.9. mpileup is a samtools function that has since been moved into bcftools. I tested samtools 1.9 and bcftools 1.11, and both need the -L flag. I passed all related info and files to Katrina.
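To make the `-L` point concrete: in the versions discussed here (samtools 1.9, bcftools 1.11), `mpileup -L` sets the maximum per-file depth for INDEL calling, so deep amplicon data can exceed a low limit and lose INDELs. The sketch below only builds the command line with an explicit `-L` value. The function name, file names, and the depth value in the usage note are assumptions for illustration; the thread does not state which value the fix used.

```python
from typing import List


def build_mpileup_command(
    reference: str,
    bam: str,
    max_indel_depth: int,
    tool: str = "bcftools",
) -> List[str]:
    """Build an mpileup argv with an explicit -L (max depth for INDEL calling).

    `tool` may be "samtools" or "bcftools"; the versions discussed in this
    thread accept both -f (reference FASTA) and -L for mpileup.
    """
    if tool not in ("samtools", "bcftools"):
        raise ValueError(f"unsupported tool: {tool!r}")
    if max_indel_depth <= 0:
        raise ValueError("max_indel_depth must be a positive integer")
    return [tool, "mpileup", "-f", reference, "-L", str(max_indel_depth), bam]
```

For example, `build_mpileup_command("ref.fa", "sample.bam", 1000000)` returns an argument list that can be passed to `subprocess.run`; the value `1000000` is a placeholder, and the right limit depends on the sequencing depth of the data.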