Closed DanielAndreasen closed 3 years ago
Are you able to access the hg38.fa file if you directly enter the link into your web browser?
Yes I am. I just tried running the command again, with the same error. I'm running from the newest master branch.
That is strange. And there's nothing about the environment where you're running create_reports that would prevent access to the file at Amazon? Since you do have access to the file via the browser, one workaround would be to download it and just use the local version.
In that case I suggest you download the fasta (and its associated index) and use them as local files. I can't reproduce your problem here. To run the example you referenced download https://s3.dualstack.us-east-1.amazonaws.com/igv.broadinstitute.org/genomes/seq/hg38/hg38.fa and https://s3.dualstack.us-east-1.amazonaws.com/igv.broadinstitute.org/genomes/seq/hg38/hg38.fa.fai. Save them locally and use the full path to the fasta file.
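The download-and-use-locally workaround could be scripted roughly as follows. This is only a sketch: the helper names `local_targets` and `download_reference` and the `reference` directory are illustrative and not part of igv-reports, and it assumes the index sits next to the FASTA under the same URL plus `.fai`, as in the two links above. Note that hg38.fa is a multi-gigabyte file.

```python
import os
import urllib.request

# URL taken from the comment above; the index is the same URL plus ".fai".
FASTA_URL = (
    "https://s3.dualstack.us-east-1.amazonaws.com/"
    "igv.broadinstitute.org/genomes/seq/hg38/hg38.fa"
)


def local_targets(fasta_url, dest_dir):
    """Return [(url, local_path), ...] for the FASTA and its .fai index."""
    name = fasta_url.rsplit("/", 1)[-1]
    fasta_path = os.path.join(dest_dir, name)
    return [
        (fasta_url, fasta_path),
        (fasta_url + ".fai", fasta_path + ".fai"),
    ]


def download_reference(fasta_url, dest_dir):
    """Download the FASTA and index if missing; return the full FASTA path."""
    os.makedirs(dest_dir, exist_ok=True)
    targets = local_targets(fasta_url, dest_dir)
    for url, path in targets:
        if not os.path.exists(path):  # skip files that are already present
            urllib.request.urlretrieve(url, path)
    return os.path.abspath(targets[0][1])


# Example call (downloads several gigabytes, so it is left commented out):
# fasta_path = download_reference(FASTA_URL, "reference")
```

The returned absolute path is what would be passed to create_report in place of the URL.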
The other thing you might try, python being python, is to create and install to a fresh environment. You might have an old version of pysam that is not recognizing urls.
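To see which versions are actually active in a given environment, a small check like the one below may help. It is a sketch that assumes Python 3.8 or later (for `importlib.metadata`); the helper names are illustrative, and the package names `pysam` and `igv-reports` are the distribution names that appear in the `pip freeze` output later in this thread.

```python
import sys
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def environment_summary(packages=("pysam", "igv-reports")):
    """Collect the Python version and the versions of the given packages."""
    summary = {"python": sys.version.split()[0]}
    for name in packages:
        summary[name] = installed_version(name)
    return summary


print(environment_summary())
```

A value of `None` for a package means it is not installed in the interpreter that ran the script, which can itself be a hint that the wrong environment is active.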
I'm closing this as it can't be reproduced.
Sorry to see this issue closed so fast.
I tried making a new conda environment, installed pip, and then igv-reports. I'm using python 3.8.6 and pysam 0.16.0.1 (the latest version as of right now), and it still doesn't work.
However, if I download the reference genome and its index I can make it run.
Just out of curiosity, which versions of python and pysam are you using?
I'll re-open if you describe a problem I can reproduce. Closing just takes it off my active list; I don't mean to preclude discussion and questions. I use python 3.7.2 with this project.
And which version of pysam do you use?
We have tried 0.15.3 and 0.16.0.1
I am getting the same error. You can replicate it with this Dockerfile;
FROM continuumio/miniconda3:4.5.4
RUN conda install bioconda::pysam==0.15.3
# need to install igv-reports from Git because the pip version is outdated and lacks some critical bug fixes; https://github.com/igvteam/igv-reports/issues/47
RUN git clone https://github.com/igvteam/igv-reports.git && \
cd igv-reports && \
git checkout 7e12305 && \
pip install -r requirements.txt && \
python setup.py install
ADD test.sh /test.sh
with this test script test.sh;
#!/bin/bash
set -x
create_report \
/igv-reports/examples/variants/variants.vcf.gz \
https://s3.amazonaws.com/igv.broadinstitute.org/genomes/seq/hg38/hg38.fa \
--ideogram examples/variants/cytoBandIdeo.txt \
--flanking 1000 \
--info-columns GENE TISSUE TUMOR COSMIC_ID GENE SOMATIC \
--tracks \
/igv-reports/examples/variants/variants.vcf.gz \
/igv-reports/examples/variants/recalibrated.bam \
/igv-reports/examples/variants/refGene.sort.bed.gz \
--output igvjs_viewer.test.html
Running it;
$ docker run --rm -it igv-reports-1.0.1 bash
root@729dc29e270b:/# ./test.sh
+ create_report /igv-reports/examples/variants/variants.vcf.gz https://s3.amazonaws.com/igv.broadinstitute.org/genomes/seq/hg38/hg38.fa --ideogram examples/variants/cytoBandIdeo.txt --flanking 1000 --info-columns GENE TISSUE TUMOR COSMIC_ID GENE SOMATIC --tracks /igv-reports/examples/variants/variants.vcf.gz /igv-reports/examples/variants/recalibrated.bam /igv-reports/examples/variants/refGene.sort.bed.gz --output igvjs_viewer.test.html
Traceback (most recent call last):
File "/opt/conda/bin/create_report", line 33, in <module>
sys.exit(load_entry_point('igv-reports==1.0.1', 'console_scripts', 'create_report')())
File "/opt/conda/lib/python3.7/site-packages/igv_reports-1.0.1-py3.7.egg/igv_reports/report.py", line 234, in main
create_report(args)
File "/opt/conda/lib/python3.7/site-packages/igv_reports-1.0.1-py3.7.egg/igv_reports/report.py", line 87, in create_report
data = fasta.get_data(args.fasta, region)
File "/opt/conda/lib/python3.7/site-packages/igv_reports-1.0.1-py3.7.egg/igv_reports/fasta.py", line 21, in get_data
fasta = pysam.FastaFile(fasta_file)
File "pysam/libcfaidx.pyx", line 123, in pysam.libcfaidx.FastaFile.__cinit__
File "pysam/libcfaidx.pyx", line 155, in pysam.libcfaidx.FastaFile._open
OSError: file `https://s3.amazonaws.com/igv.broadinstitute.org/genomes/seq/hg38/hg38.fa` not found
note that I am using a git clone of the repo in there and not installing from pip, for the reason mentioned in the comments.
I also get the same error if I install igv-reports from pip;
Dockerfile;
FROM continuumio/miniconda3:4.5.4
RUN conda install bioconda::pysam==0.15.3 conda-forge::unzip
RUN pip install igv-reports
RUN wget https://s3.amazonaws.com/igv.org.test/reports/examples.zip && unzip examples.zip
ADD test.sh /test.sh
test.sh
#!/bin/bash
set -x
create_report \
/examples/variants/variants.vcf.gz \
https://s3.amazonaws.com/igv.broadinstitute.org/genomes/seq/hg38/hg38.fa \
--ideogram examples/variants/cytoBandIdeo.txt \
--flanking 1000 \
--info-columns GENE TISSUE TUMOR COSMIC_ID GENE SOMATIC \
--tracks \
/examples/variants/variants.vcf.gz \
/examples/variants/recalibrated.bam \
/examples/variants/refGene.sort.bed.gz \
--output igvjs_viewer.test.html
result;
$ docker run --rm -it igv-reports-1.0.1 bash
root@342a8774e36b:/# ./test.sh
+ create_report /examples/variants/variants.vcf.gz https://s3.amazonaws.com/igv.broadinstitute.org/genomes/seq/hg38/hg38.fa --ideogram examples/variants/cytoBandIdeo.txt --flanking 1000 --info-columns GENE TISSUE TUMOR COSMIC_ID GENE SOMATIC --tracks /examples/variants/variants.vcf.gz /examples/variants/recalibrated.bam /examples/variants/refGene.sort.bed.gz --output igvjs_viewer.test.html
Traceback (most recent call last):
File "/opt/conda/bin/create_report", line 8, in <module>
sys.exit(main())
File "/opt/conda/lib/python3.7/site-packages/igv_reports/report.py", line 234, in main
create_report(args)
File "/opt/conda/lib/python3.7/site-packages/igv_reports/report.py", line 87, in create_report
data = fasta.get_data(args.fasta, region)
File "/opt/conda/lib/python3.7/site-packages/igv_reports/fasta.py", line 21, in get_data
fasta = pysam.FastaFile(fasta_file)
File "pysam/libcfaidx.pyx", line 123, in pysam.libcfaidx.FastaFile.__cinit__
File "pysam/libcfaidx.pyx", line 155, in pysam.libcfaidx.FastaFile._open
OSError: file `https://s3.amazonaws.com/igv.broadinstitute.org/genomes/seq/hg38/hg38.fa` not found
root@342a8774e36b:/# ls -l /examples/variants/variants.vcf.gz
-rw-r--r-- 1 root root 7329 Jun 15 2020 /examples/variants/variants.vcf.gz
root@342a8774e36b:/# pip freeze
certifi==2020.12.5
chardet==4.0.0
idna==2.10
igv-reports==1.0.1
intervaltree==3.1.0
pysam==0.15.3
requests==2.25.1
sortedcontainers==2.3.0
urllib3==1.26.3
@stevekm OK I will look into it. The examples should work of course, but as a workaround you can download that fasta and reference it as a local file.
Yes, I tried that and it works. The issue is that I want to include a test script like this with my container builds to check that the install is working, in which case it's very helpful to be able to load the reference genome from the URL as shown. Hope there's a solution possible :)
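A container smoke test of the kind described here might look like the sketch below. The helper names `build_command` and `smoke_test` are hypothetical; the only thing taken from this thread is the `create_report <vcf> <fasta> --output <file>` invocation. It assumes `create_report` is on the PATH inside the container.

```python
import os
import subprocess


def build_command(vcf, fasta, output):
    """Build the create_report argument list used in the test scripts above."""
    return ["create_report", vcf, fasta, "--output", output]


def smoke_test(vcf, fasta, output="igvjs_viewer.test.html"):
    """Return True if create_report exits 0 and writes a non-empty report."""
    result = subprocess.run(build_command(vcf, fasta, output))
    return (
        result.returncode == 0
        and os.path.isfile(output)
        and os.path.getsize(output) > 0
    )
```

Passing a local FASTA path keeps the test independent of network access; passing the URL exercises the remote-file path that fails in this issue.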
@stevekm I will try to get to this next week. This is not a very active project compared to IGV and igv.js, plus it's in a different language (python), so I will have to clear some time. However, I do recall trying to reproduce this before without success. I do not use Docker and I don't know what effect it would have, but it's the common thread between the OP and your report.
on a side note, I found out that my conda installation in those Dockerfiles was slightly wrong, it should be this;
RUN conda install python=3.6.5 bioconda::pysam==0.15.3
to avoid the error described here where upgrading the base Python version breaks conda; https://stackoverflow.com/questions/19825250/after-anaconda-installation-conda-command-fails-with-importerror-no-module-na
this does not change the error with pysam seen here but does get in the way of trying to debug it
@stevekm I just pushed 1.0.2, in response to an earlier report from you that the PIP package was not in sync with github. There is some possibility, albeit slight, that this might resolve this issue.
Looking at the stack trace, the error is in pysam, which is trying to open the URL as a local file
File "pysam/libcfaidx.pyx", line 155, in pysam.libcfaidx.FastaFile._open
OSError: file `https://s3.amazonaws.com/igv.broadinstitute.org/genomes/seq/hg38/hg38.fa` not found
Which in turn indicates an error in the htslib "isremote" function. I do not know why you would see this error, I do not see it with a clean pip install, but it has something to do with the pysam dependency.
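Since the traceback ends in `pysam.FastaFile(fasta_file)`, one way to separate a pysam/htslib problem from an igv-reports problem is to make that same call directly. The following is a minimal sketch; `check_fasta_open` and its `opener` parameter are hypothetical names introduced here so the helper can also be exercised without pysam installed.

```python
def check_fasta_open(path, opener=None):
    """Try to open `path`; return (True, None) or (False, error message).

    By default this uses pysam.FastaFile, the call shown in the traceback.
    """
    if opener is None:
        import pysam  # imported lazily so a substitute opener can be used

        opener = pysam.FastaFile
    try:
        handle = opener(path)
    except (OSError, ValueError) as exc:
        return False, str(exc)
    handle.close()
    return True, None


# Example (needs pysam and network access, so it is left commented out):
# url = "https://s3.amazonaws.com/igv.broadinstitute.org/genomes/seq/hg38/hg38.fa"
# print(check_fasta_open(url))
```

If this fails with the same "not found" message outside of create_report, the problem lies in the pysam build or its environment rather than in igv-reports.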
Per notes above, this error was not reproducible but could have been caused by incorrect or out-of-date files pushed to pypi as part of release 1.0.1.
I ran into the same error with a Singularity image of igv-reports, with additional error information:
$ singularity exec -B $PWD singularity/igv-reports-1.0.4.sif create_report igv-reports/examples/variants/variants.vcf.gz https://s3.amazonaws.com/igv.broadinstitute.org/genomes/seq/hg38/hg38.fa
[E::easy_errno] Libcurl reported error 77 (Problem with the SSL CA cert (path? access rights?))
[E::fai_load3_core] Failed to open FASTA index https://s3.amazonaws.com/igv.broadinstitute.org/genomes/seq/hg38/hg38.fa.fai: Input/output error
Traceback (most recent call last):
File "/usr/local/bin/create_report", line 8, in <module>
sys.exit(main())
File "/usr/local/lib/python3.9/dist-packages/igv_reports/report.py", line 271, in main
create_report(args)
File "/usr/local/lib/python3.9/dist-packages/igv_reports/report.py", line 105, in create_report
data = fasta.get_data(args.fasta, region)
File "/usr/local/lib/python3.9/dist-packages/igv_reports/fasta.py", line 21, in get_data
fasta = pysam.FastaFile(fasta_file)
File "pysam/libcfaidx.pyx", line 123, in pysam.libcfaidx.FastaFile.__cinit__
File "pysam/libcfaidx.pyx", line 183, in pysam.libcfaidx.FastaFile._open
OSError: error when opening file `https://s3.amazonaws.com/igv.broadinstitute.org/genomes/seq/hg38/hg38.fa`
I'm using pysam 0.18.0
This looks like a libcurl bug affecting SSL certificates; there is nothing wrong with the certificate for that file. A workaround is to use an http url (instead of https). Specifically:
http://s3.amazonaws.com/igv.broadinstitute.org/genomes/seq/hg38/hg38.fa
I don't know anything about singularity, but here is a Colab notebook with that example (working). I will update the readme to use http: https://colab.research.google.com/drive/1JJvyDm0r_Lyhmuk27zEwfkE0J1gV0wzp?usp=sharing
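For scripts that build the FASTA URL programmatically, the https-to-http workaround can be applied with the standard library. This is a sketch only; `to_http` is an illustrative helper name, and plain http gives up transport encryption, so it is best treated as a stopgap for public reference data.

```python
from urllib.parse import urlsplit, urlunsplit


def to_http(url):
    """Rewrite an https URL to http; return anything else unchanged."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        return url
    return urlunsplit(parts._replace(scheme="http"))


print(to_http("https://s3.amazonaws.com/igv.broadinstitute.org/genomes/seq/hg38/hg38.fa"))
```

Local paths and URLs that already use http pass through untouched, so the helper can be applied to whatever value would otherwise be given to create_report.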
Thank you, this workaround solved my issue.
When I run any of the two examples I get the following error: