pysam-developers / pysam

Pysam is a Python package for reading, manipulating, and writing genomics data such as SAM/BAM/CRAM and VCF/BCF files. It's a lightweight wrapper of the HTSlib API, the same one that powers samtools, bcftools, and tabix.
https://pysam.readthedocs.io/en/latest/
MIT License

Pysam.TabixFile not recognising S3 URL #1249

Open Mathew-B-SDGS opened 11 months ago

Mathew-B-SDGS commented 11 months ago

While accessing gnomAD data on S3 with pysam 0.22.0, I received this error: OSError: file "https://gnomad-public-us-east-1.s3.amazonaws.com/release/4.0/vcf/genomes/gnomad.genomes.v4.0.sites.chr1.vcf.bgz" not found

After uninstalling it and switching to pysam 0.20.0, the same code worked as intended. The URL was tested with curl, and the only thing I changed was the pysam version. (The code was run in a clean venv on Linux minty.)

Below is the error message received with pysam 0.22.0:

vcf = pysam.TabixFile("https://gnomad-public-us-east-1.s3.amazonaws.com/release/4.0/vcf/genomes/gnomad.genomes.v4.0.sites.chr1.vcf.bgz")

vcf = pysam.TabixFile(
  File "pysam/libctabix.pyx", line 349, in pysam.libctabix.TabixFile.__cinit__
  File "pysam/libctabix.pyx", line 378, in pysam.libctabix.TabixFile._open
OSError: file `https://gnomad-public-us-east-1.s3.amazonaws.com/release/4.0/vcf/genomes/gnomad.genomes.v4.0.sites.chr1.vcf.bgz` not found

jmarshall commented 11 months ago

You don't say how you installed pysam in each case — via conda? Or pip? Using wheels?
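
One way to collect the installation details asked about here is a small script along these lines. It is only a sketch: the helper name `describe_install` is hypothetical, it relies solely on the standard library's `importlib.metadata`, and it assumes the installer recorded its name in the optional `INSTALLER` metadata file (pip does this; other tools may write something else or nothing). It does not distinguish a prebuilt wheel from a local source build.

```python
import sys
from importlib import metadata


def describe_install(dist_name="pysam"):
    """Return basic facts about how a distribution was installed.

    Hypothetical helper for illustration; not part of pysam.
    """
    try:
        dist = metadata.distribution(dist_name)
    except metadata.PackageNotFoundError:
        return {"name": dist_name, "installed": False}

    # INSTALLER is an optional metadata file. pip writes "pip" there;
    # other tools may write their own name or omit the file entirely.
    installer = (dist.read_text("INSTALLER") or "unknown").strip()
    return {
        "name": dist_name,
        "installed": True,
        "version": dist.version,
        "installer": installer,
        "location": str(dist.locate_file("")),
        "python": sys.version.split()[0],
    }


if __name__ == "__main__":
    # Run once in each environment (0.20.0 and 0.22.0) and compare.
    for key, value in describe_install().items():
        print(f"{key}: {value}")
```

Running this in both the working and the failing environment and pasting the two outputs would show whether the installations differ in anything besides the version number.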