pysam-developers / pysam

Pysam is a Python package for reading, manipulating, and writing genomics data such as SAM/BAM/CRAM and VCF/BCF files. It's a lightweight wrapper of the HTSlib API, the same one that powers samtools, bcftools, and tabix.
https://pysam.readthedocs.io/en/latest/
MIT License

How to access the samples in the docs? #1087

Open aiqc opened 2 years ago

aiqc commented 2 years ago

I want to get a feel for the library, but I can't follow along with the docs because I don't have dummy data.

>>> from pysam import VariantFile
>>> 
>>> 
>>> bcf_in = VariantFile("test.bcf") 
[E::hts_open_format] Failed to open file "test.bcf" : No such file or directory
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pysam/libcbcf.pyx", line 4054, in pysam.libcbcf.VariantFile.__init__
  File "pysam/libcbcf.pyx", line 4279, in pysam.libcbcf.VariantFile.open
FileNotFoundError: [Errno 2] could not open variant file `b'test.bcf'`: No such file or directory

So I took a look in https://github.com/pysam-developers/pysam/tree/master/tests and tried to access some files remotely.

>>> bcf_in = VariantFile("https://raw.githubusercontent.com/pysam-developers/pysam/master/tests/cbcf_data/example_vcf40.vcf") 
>>> 
>>> 
>>> bcf_in
<pysam.libcbcf.VariantFile object at 0x1036e0230>
>>> 
>>> 
>>> for rec in bcf_in.fetch('chr1', 1, 2):
...     bcf_out.write(rec)
... 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pysam/libcbcf.pyx", line 4401, in pysam.libcbcf.VariantFile.fetch
ValueError: fetch requires an index

AndreasHeger commented 2 years ago

Hi, we have put only the text-formatted data into the repository. Other formats and indices are created using the Makefiles in the tests/*_data directories:

make -C tests/pysam_data
make -C tests/cbcf_data
make -C tests/tabix_data

(These commands are run when the test suite starts.)
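
For anyone who only wants indexed dummy data without running the Makefiles, here is a minimal sketch of one way to get past "fetch requires an index". It writes a tiny, made-up VCF to a local directory, then bgzip-compresses and tabix-indexes it with `pysam.tabix_index`. The helper names `write_demo_vcf` and `index_and_fetch` and the two records are hypothetical, not part of pysam or its test data, and this is not necessarily what the Makefiles do.

```python
from pathlib import Path

# Hypothetical two-record VCF, invented for illustration only.
# Fields must be tab-separated and records sorted by position for tabix.
VCF_TEXT = "\n".join([
    "##fileformat=VCFv4.2",
    "##contig=<ID=chr1,length=1000>",
    "\t".join(["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]),
    "\t".join(["chr1", "100", ".", "A", "G", ".", "PASS", "."]),
    "\t".join(["chr1", "500", ".", "C", "T", ".", "PASS", "."]),
]) + "\n"


def write_demo_vcf(directory):
    """Write the demo VCF into `directory` and return its path."""
    path = Path(directory) / "demo.vcf"
    path.write_text(VCF_TEXT)
    return path


def index_and_fetch(vcf_path, contig, start, stop):
    """Compress and index a local VCF, then fetch records in a region.

    pysam.tabix_index bgzip-compresses the file, writes a .tbi index next
    to it, and returns the name of the compressed file. By default the
    uncompressed original is removed.
    """
    import pysam  # imported here so write_demo_vcf works without pysam

    gz_path = pysam.tabix_index(str(vcf_path), preset="vcf", force=True)
    with pysam.VariantFile(gz_path) as vcf:
        # fetch() takes 0-based, half-open coordinates; rec.pos is 1-based.
        return [
            (rec.chrom, rec.pos, rec.ref, rec.alts)
            for rec in vcf.fetch(contig, start, stop)
        ]
```

With these helpers, `index_and_fetch(write_demo_vcf("."), "chr1", 0, 200)` should return only the record at position 100. If you don't need region queries, iterating the file directly (`for rec in bcf_in:`) works on an unindexed VCF, including the remote one opened above.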