ccmbioinfo / seqr

web-based analysis tool for rare disease genomics. Forked from Broad Institute, hosted for CCM
https://seqr.ccm.sickkids.ca/
GNU Affero General Public License v3.0

Cannot fetch reference files for IGV panels #1

Open mike8115 opened 3 weeks ago

mike8115 commented 3 weeks ago

Summary

Users cannot view reads. The IGV panel opens, but the tracks do not load. The browser console shows that an HTTP request for the reference files returns a 403 error.

The failing request is a GET to seqr's IGV API. This is the function that handles it and fetches the reference files from a remote server:

https://github.com/ccmbioinfo/seqr/blob/917c6d1a3df9eb2e0f3b2484da06f32b0693cd78/seqr/views/apis/igv_api.py#L205-L217

Offending resource

https://seqr.ccm.sickkids.ca/api/igv_genomes/broadinstitute.org/genomes/seq/hg38/hg38.fa.fai

mike8115 commented 3 weeks ago

The GET request calls a function that in turn makes another request to https://s3.amazonaws.com/igv.broadinstitute.org/genomes/seq/hg38/hg38.fa.fai and it is this upstream GET that returns the 403 error.

Oddly enough, visiting the URL in a browser works fine. This leads me to suspect that the GET request sent by seqr differs in some way from the one sent by my browser.
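One way to check whether the request headers are what differ is to issue the same GET outside seqr, once with the HTTP client's default headers and once with a browser-like User-Agent. The following is only a rough sketch using the Python standard library. The `probe` helper and its `opener` parameter are hypothetical names made up for illustration, and urllib's default User-Agent is not necessarily the one seqr's HTTP client sends, so the results may not match production exactly.

```python
import urllib.error
import urllib.request

# Upstream URL reported in this thread.
URL = "https://s3.amazonaws.com/igv.broadinstitute.org/genomes/seq/hg38/hg38.fa.fai"
# Browser-like User-Agent string (the same one used in the hotfix).
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64; rv:130.0) Gecko/20100101 Firefox/130.0"


def probe(url, user_agent=None, opener=urllib.request.urlopen):
    """Return the HTTP status code of a GET to `url` (hypothetical helper).

    When `user_agent` is None, urllib sends its own default User-Agent.
    `opener` is injectable so the function can be exercised without a network.
    """
    headers = {"User-Agent": user_agent} if user_agent else {}
    request = urllib.request.Request(url, headers=headers)
    try:
        with opener(request, timeout=30) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx responses; the status is on the exception.
        return err.code


# Example usage (performs real network requests, so left commented out):
# print("default User-Agent:", probe(URL))
# print("browser User-Agent:", probe(URL, BROWSER_UA))
```

If the two calls return different status codes, that would support the idea that the upstream server is filtering on User-Agent.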

Trial and error reveals that adding a valid User-Agent header allows the request to succeed. I'm not entirely sure why. However, adding the following line before the request is sent should fix it:

headers["User-Agent"] = "Mozilla/5.0 (X11; Linux x86_64; rv:130.0) Gecko/20100101 Firefox/130.0"
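For context, here is a minimal sketch of how that line might sit in the code that prepares the upstream request. It is not the actual seqr implementation: `build_upstream_headers` and `incoming_headers` are hypothetical names, and the assumption that other headers (a Range header, for example) are passed along is mine.

```python
# Browser-like User-Agent taken from the hotfix in this thread.
BROWSER_USER_AGENT = (
    "Mozilla/5.0 (X11; Linux x86_64; rv:130.0) Gecko/20100101 Firefox/130.0"
)


def build_upstream_headers(incoming_headers=None):
    """Return the headers to send with the upstream genome request.

    Hypothetical helper: copies any headers the caller wants forwarded,
    then sets User-Agent the same way the hotfix line does.
    """
    # Copy so the caller's dict is not modified.
    headers = dict(incoming_headers or {})
    headers["User-Agent"] = BROWSER_USER_AGENT
    return headers


# Example: forwarding an assumed Range header alongside the User-Agent.
example = build_upstream_headers({"Range": "bytes=0-99"})
```

Keeping the string in a single constant would make it easier to swap for something that identifies seqr itself, if the upstream server turns out to accept any non-empty User-Agent.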

The implications of adding this header still need to be investigated. In the meantime, I've applied this hotfix inside the production container (not to the actual file on the VM).