pysam-developers / pysam

Pysam is a Python package for reading, manipulating, and writing genomics data such as SAM/BAM/CRAM and VCF/BCF files. It's a lightweight wrapper of the HTSlib API, the same one that powers samtools, bcftools, and tabix.
https://pysam.readthedocs.io/en/latest/
MIT License

Error running `bcftools plugin` #1287

Open awgymer opened 3 months ago

awgymer commented 3 months ago

Trying to run bcftools plugin via the pysam.bcftools API results in the following error:

pysam.bcftools.plugin('-l')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/me/.pyenv/versions/venv/lib/python3.10/site-packages/pysam/utils.py", line 83, in __call__
    raise SamtoolsError(
pysam.utils.SamtoolsError: "bcftools returned with error 1: stdout=, stderr=[E::bcftools_main] unrecognized command 'plugin'\n"

I am running pysam==0.21.0

jmarshall commented 3 months ago

Thanks for the report. We would have to arrange for the bcftools code within pysam to be built with ENABLE_BCF_PLUGINS to include this code.

Also the bcftools plugins are not being built — in fact, their source code is currently omitted from the pysam repository. It might be best to leave it that way and use a separate bcftools installation's plugins via $BCFTOOLS_PLUGINS etc.
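
A minimal sketch of the $BCFTOOLS_PLUGINS approach mentioned above, which bypasses pysam's bundled bcftools entirely. It assumes a separate bcftools executable is installed and on PATH, and that you know where that installation keeps its plugins. The helper names `plugin_env` and `list_external_plugins` and the example directory are hypothetical and not part of pysam.

```python
import os
import shutil
import subprocess


def plugin_env(plugin_dir, base_env=None):
    """Return a copy of the environment with BCFTOOLS_PLUGINS set.

    plugin_dir is an assumption: the directory where the separate
    bcftools installation keeps its plugin shared objects.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["BCFTOOLS_PLUGINS"] = plugin_dir
    return env


def list_external_plugins(plugin_dir, bcftools="bcftools"):
    """Run `bcftools plugin -l` using an external bcftools executable.

    Returns the subprocess.CompletedProcess so the caller can inspect
    returncode, stdout, and stderr.
    """
    exe = shutil.which(bcftools)
    if exe is None:
        raise FileNotFoundError(f"no {bcftools!r} executable found on PATH")
    return subprocess.run(
        [exe, "plugin", "-l"],
        env=plugin_env(plugin_dir),
        capture_output=True,
        text=True,
        check=False,
    )
```

Usage would look something like `list_external_plugins("/path/to/bcftools/plugins")`, where the path is a placeholder for wherever your bcftools installation put its plugins.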

awgymer commented 3 months ago

I see. That feels like a somewhat odd choice given that the plugins usually come with the default distribution of bcftools?

If using $BCFTOOLS_PLUGINS to point to another install's plugins, I think at that point it may be cleaner to install with an external htslib as documented here? (Although I am not sure how tricky the issues of pointing to the correct libhts.so can be, having not done that before.)

jmarshall commented 3 months ago

I don't see it as an odd choice. The goal of pysam is to provide convenient Python facilities, which has traditionally (and somewhat unfortunately!) included providing access to some samtools and bcftools commands. Providing built-in access to commands implemented as bcftools plugins is fairly far down the priority list… :shrug: the clue is in both parts of the name bcftools plugin… but enabling access to existing bcftools plugins installed from elsewhere would be more feasible.

Bundling and distributing bcftools plugins would bring its own problems, so might be best avoided.

Re your second paragraph, note that bcftools plugins (which provide additional commands) are a separate matter from htslib plugins (which at present provide remote file access).

awgymer commented 3 months ago

It's not so much that I have a problem with the choice not to support plugin by default; rather, it seemed unintuitive to me. Yes, they are plugins, but they have also been part of the default bcftools distribution whenever I have installed it. From your comment I think perhaps you see access to samtools/bcftools as more of a side effect than a core functionality, and from that perspective the choice makes more sense.

bcftools plugins are separate from htslib plugins, but my thinking was that if I am pointing to some plugin dir anyway, the easiest way to ensure that all the libraries are on the same version is to use the whole external libs. But I may be misunderstanding how that works 🥲

jmarshall commented 3 months ago

Building the bundled subset of the bcftools source code without defining ENABLE_BCF_PLUGINS, which is what causes the “unrecognized command 'plugin'” error, was an unintentional oversight and will be fixed.