Pysam is a Python package for reading, manipulating, and writing genomics data such as SAM/BAM/CRAM and VCF/BCF files. It's a lightweight wrapper of the HTSlib API, the same one that powers samtools, bcftools, and tabix.
I have not been able to work out from the documentation the precise difference in behavior between the 'all' and 'samtools' steppers in their default settings. The docs seem to indicate that both have the same filters engaged (BAM_FUNMAP | BAM_FSECONDARY | BAM_FQCFAIL | BAM_FDUP). I also printed the base quality of each read, which in every case is above the samtools default minimum of 13.
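For reference, the flag filter named above can be written out as a plain bit mask. This is only a sketch of what the four flags mean according to the SAM specification, not pysam's internal code; `FILTER_MASK` and `passes_flag_filter` are names made up for illustration.

```python
# SAM flag bits as defined in the SAM specification.
BAM_FUNMAP = 0x4        # read is unmapped
BAM_FSECONDARY = 0x100  # secondary alignment
BAM_FQCFAIL = 0x200     # read fails platform/vendor quality checks
BAM_FDUP = 0x400        # PCR or optical duplicate

# Combined mask: a read is skipped if any of these bits is set.
FILTER_MASK = BAM_FUNMAP | BAM_FSECONDARY | BAM_FQCFAIL | BAM_FDUP


def passes_flag_filter(flag: int, mask: int = FILTER_MASK) -> bool:
    """Return True if none of the masked bits are set in the read's flag."""
    return (flag & mask) == 0


if __name__ == "__main__":
    print(FILTER_MASK)
    for flag in (99, 355, 1123):
        print(flag, passes_flag_filter(flag))
```

If both steppers really applied only this mask, they would keep exactly the same reads, so the extra read reported by 'all' must be removed by something other than these four bits.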
Here is sample code for which the two steppers give different results (pysam version 0.21.0, Python 3.6.9, Ubuntu 18.04):
Output:
ubuntu@ip-XXX~$ python3 stepper.py
stepper all
query_name: SRR099961.15186690 base_now: A mq_now: 18 bq_now: 21
query_name: SRR099961.45087351 base_now: A mq_now: 36 bq_now: 67
query_name: SRR099961.113296055 base_now: A mq_now: 14 bq_now: 31
stepper samtools
query_name: SRR099961.15186690 base_now: A mq_now: 18 bq_now: 21
query_name: SRR099961.45087351 base_now: A mq_now: 36 bq_now: 67
ubuntu@ip-XXX~$
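The script itself is not shown here. A minimal sketch of how such a comparison could be written is given below; it is not the original stepper.py. The BAM path, contig, and position are hypothetical command-line arguments, `format_read` and `report` are names made up for illustration, and the BAM file is assumed to be indexed.

```python
import sys


def format_read(query_name, base, mapq, baseq):
    """Format one read in the same layout as the output above."""
    return f"query_name: {query_name} base_now: {base} mq_now: {mapq} bq_now: {baseq}"


def report(bam_path, contig, pos, stepper):
    """Print the reads covering 0-based position `pos` under the given stepper."""
    import pysam  # imported lazily so format_read works without pysam installed

    print("stepper", stepper)
    with pysam.AlignmentFile(bam_path, "rb") as bam:
        # truncate=True restricts the columns to the requested interval.
        for column in bam.pileup(contig, pos, pos + 1, truncate=True, stepper=stepper):
            for pr in column.pileups:
                # Deletions and reference skips have no query base at this column.
                if pr.is_del or pr.is_refskip:
                    continue
                read = pr.alignment
                qpos = pr.query_position
                print(format_read(
                    read.query_name,
                    read.query_sequence[qpos],
                    read.mapping_quality,
                    read.query_qualities[qpos],
                ))


# Hypothetical usage: python3 stepper_sketch.py sample.bam chr1 12345
if __name__ == "__main__" and len(sys.argv) == 4:
    bam_path, contig, pos = sys.argv[1], sys.argv[2], int(sys.argv[3])
    for stepper in ("all", "samtools"):
        report(bam_path, contig, pos, stepper)
```

Printing `read.flag` alongside the other fields would show whether the read that only appears under 'all' differs from the others in its pairing flags.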
PS: another question. For the samtools stepper, the wording of the documentation implies that min_base_quality keeps everything greater than or equal to the threshold, while min_mapping_quality keeps only reads strictly greater than it. Is this the correct behavior?
From the docs:
min_base_quality (int) – Minimum base quality. Bases below the minimum quality will not be output. The default is 13.
adjust_capq_threshold (int) – adjust mapping quality. The default is 0 for no adjustment. The recommended value for adjustment is 50.
min_mapping_quality (int) – only use reads above a minimum mapping quality. The default is 0.
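To make the two readings of that wording concrete, here is a small sketch. It only spells out the two possible interpretations and does not claim which one pysam or samtools actually implements; the function names are made up for illustration.

```python
def keep_inclusive(quality: int, minimum: int) -> bool:
    """'Below the minimum is dropped': a value equal to the minimum is kept."""
    return quality >= minimum


def keep_exclusive(quality: int, minimum: int) -> bool:
    """'Only above the minimum': a value equal to the minimum is dropped."""
    return quality > minimum


if __name__ == "__main__":
    # The two readings disagree only when the value equals the threshold.
    for q in (12, 13, 14):
        print(q, keep_inclusive(q, 13), keep_exclusive(q, 13))
```

Under the strict reading, the default min_mapping_quality of 0 would already exclude reads with mapping quality 0, which is why the distinction matters.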
Dear all,
Thanks dearly for this invaluable library.