pysam-developers / pysam

Pysam is a Python package for reading, manipulating, and writing genomics data such as SAM/BAM/CRAM and VCF/BCF files. It's a lightweight wrapper of the HTSlib API, the same one that powers samtools, bcftools, and tabix.
https://pysam.readthedocs.io/en/latest/
MIT License

In pysam 0.22.0, AlignmentFile(REMOTE_BAM) reports SSL CA cert error with remote bam files #1268

Closed litaifang closed 3 months ago

litaifang commented 6 months ago

Hi,

In pysam v0.22.0 (but not in 0.21.0), pysam.AlignmentFile(REMOTE_BAM_FILE) fails with the following error messages. I'm not sure what the issue is, but the errors go away when I use 0.21.0. This seems to happen for BAM files served over GCS, HTTP, or HTTPS.

[E::easy_errno] Libcurl reported error 77 (Problem with the SSL CA cert (path? access rights?))
[E::hts_open_format] Failed to open file "https://ftp-trace.ncbi.nlm.nih.gov/ReferenceSamples/seqc/Somatic_Mutation_WG/data/WGS/WGS_FD_N_2.bwa.dedup.bam" : Input/output error
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
Cell In[15], line 1
----> 1 pysam.AlignmentFile(
      2     "https://ftp-trace.ncbi.nlm.nih.gov/ReferenceSamples/seqc/Somatic_Mutation_WG/data/WGS/WGS_FD_N_2.bwa.dedup.bam"
      3 )

File /site-packages/pysam/libcalignmentfile.pyx:748, in pysam.libcalignmentfile.AlignmentFile.__cinit__()
File /site-packages/pysam/libcalignmentfile.pyx:947, in pysam.libcalignmentfile.AlignmentFile._open()

OSError: [Errno 5] could not open alignment file `https://ftp-trace.ncbi.nlm.nih.gov/ReferenceSamples/seqc/Somatic_Mutation_WG/data/WGS/WGS_FD_N_2.bwa.dedup.bam`: Input/output error

jmarshall commented 3 months ago

Thanks for the additional report. This problem exists when using a pre-built pysam wheel on Debian or Ubuntu. The problem is that the underlying libcurl is looking for SSL CA certificates at a path that does not exist on your machine. This can be fixed by telling it where to look; e.g., on Debian/Ubuntu this will probably be

export CURL_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt

See https://github.com/pysam-developers/pysam/issues/1257#issuecomment-2102480113 for details.

litaifang commented 3 months ago

Thanks. I was also able to set it from within a Python script, with os.environ["CURL_CA_BUNDLE"] = "/etc/ssl/certs/ca-certificates.crt".
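
For anyone who wants the in-script workaround to be a little more defensive, here is a minimal sketch. It sets CURL_CA_BUNDLE only if it is not already set and only to a bundle file that actually exists. The helper name ensure_curl_ca_bundle and the candidate list are hypothetical. Only the Debian/Ubuntu path comes from this thread; the second path is an assumption for Fedora/RHEL-style systems. To be safe, call it before opening the first remote file.

```python
import os

# Candidate CA bundle locations. The first is the Debian/Ubuntu path
# mentioned in this thread; the second is an assumption for Fedora/RHEL.
CANDIDATE_CA_BUNDLES = (
    "/etc/ssl/certs/ca-certificates.crt",
    "/etc/pki/tls/certs/ca-bundle.crt",
)


def ensure_curl_ca_bundle(candidates=CANDIDATE_CA_BUNDLES):
    """Point CURL_CA_BUNDLE at the first existing candidate file.

    Hypothetical helper, not part of pysam. An existing, non-empty
    CURL_CA_BUNDLE is left untouched. Returns the value in effect,
    or None if no candidate file was found.
    """
    current = os.environ.get("CURL_CA_BUNDLE")
    if current:
        return current
    for path in candidates:
        if os.path.isfile(path):
            os.environ["CURL_CA_BUNDLE"] = path
            return path
    return None


# Usage (needs pysam and network access, so left commented out here):
# ensure_curl_ca_bundle()
# import pysam
# bam = pysam.AlignmentFile(REMOTE_BAM_FILE)  # placeholder for your URL
```

If the helper returns None, none of the listed bundles exist on the machine, and you will need to locate your system's CA bundle and add its path to the candidates.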