pysam-developers / pysam

Pysam is a Python package for reading, manipulating, and writing genomics data such as SAM/BAM/CRAM and VCF/BCF files. It's a lightweight wrapper of the HTSlib API, the same one that powers samtools, bcftools, and tabix.
https://pysam.readthedocs.io/en/latest/
MIT License
774 stars 274 forks source link

pysam consensus options #1209

Closed noamprywes closed 1 year ago

noamprywes commented 1 year ago

Hi, I love pysam and I want to use it rather than samtools but there's one thing I can't figure out. I'm using the pysam.consensus option and I want to adjust some presets. It works fine when I pass in options like -l and --het-scale but it doesn't work for --homopoly-fix. Is there a reason that I don't understand? Also '--qual-calibration', ':hifi' doesn't work, is it possible to make it work?

Thanks, Noam

jmarshall commented 1 year ago

The following worked as I expected, communicating the options to the samtools code:

import pysam
contigs = pysam.consensus('--het-scale', '0.5', '--homopoly-fix',
                          '--qual-calibration', ':hifi', 'tests/pysam_data/ex1.bam')

Please show us the code you tried, and what output and/or errors you received.

noamprywes commented 1 year ago

Thanks so much for taking a look. I think I might just be confused. It works fine with '--homopoly-fix' but if I try

'--homopoly-fix', '0.3' it doesn't work. Is this a difference between samtools and pysam? In this page it looks like there shouldn't be a float there but the examples at the bottom seem to have 0.3 specified. Is that supposed to be a homopoly score?

http://www.htslib.org/doc/samtools-consensus.html

Everything works now. Thanks!

jmarshall commented 1 year ago

For this and most other commands (the only real exception is index), pysam is just a wrapper and doesn't itself understand anything about these options. So there's no difference between samtools and pysam in this area.

In samtools's consensus command, -p/--homopoly-fix has no argument and --homopoly-score FLOAT has a required float argument.

The -X/--config table entries at the bottom of the consensus man page are thus incorrect. I believe they should indeed be … --homopoly-score 0.3 …. (The --homopoly-score paragraph also talks about the required FLOAT argument having a default, which is confusingly written IMHO.)

Please report this to samtools as a documentation bug.