broadinstitute / SpliceAI-lookup

Website for checking SpliceAI and Pangolin scores:
https://spliceailookup.broadinstitute.org
MIT License

"ERROR: <class 'KeyError'>: '3 not in hg19.fa.'" #39

Closed Manuel-DominguezCBG closed 1 year ago

Manuel-DominguezCBG commented 1 year ago

I have installed it and the API seems to work. However, when I query for a variant, the API returns this:

{"variant": "chr3-37035154-G-A", "hg": "37", "source": "spliceai", "error": "ERROR: <class 'KeyError'>: '3 not in /Users/jocotton/hg19.fa.'", "duration": "0:00:00.031399"}

This looks like a problem with the format of the FASTA file or with how it is read. I downloaded both from here: https://www.ncbi.nlm.nih.gov/genome/guide/human/

What could be the cause of the problem?

bw2 commented 1 year ago

This is probably because the fasta has "chr3", but the code is calling the chromosome "3".
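One way to confirm this kind of mismatch is to list the sequence names the FASTA actually contains and compare them with the name being requested. Below is a minimal sketch using only the standard library. The function names `fasta_contig_names` and `match_contig` are hypothetical helpers for illustration, not part of SpliceAI-lookup.

```python
def fasta_contig_names(fasta_path):
    """Return the sequence names found on the '>' header lines of a FASTA file."""
    names = []
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                # The name is the first whitespace-delimited token after '>'.
                fields = line[1:].split()
                if fields:
                    names.append(fields[0])
    return names


def match_contig(chrom, contigs):
    """Return the spelling of `chrom` present in `contigs`, trying with and without 'chr'."""
    bare = chrom[3:] if chrom.startswith("chr") else chrom
    for candidate in (chrom, bare, "chr" + bare):
        if candidate in contigs:
            return candidate
    raise KeyError(f"{chrom} not found with or without the 'chr' prefix")
```

For example, `match_contig("3", fasta_contig_names("hg19.fa"))` would return `"chr3"` if the FASTA uses UCSC-style names. Scanning a whole genome FASTA this way is slow. If a `.fai` index exists next to the file, its first column holds the same names and is much quicker to read.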

bw2 commented 1 year ago

I just updated the download URL @ https://github.com/broadinstitute/SpliceAI-lookup/blob/master/README.md#local-install. Could you please try that hg19 fasta instead? (https://storage.cloud.google.com/gcp-public-data--broad-references/hg19/v0/Homo_sapiens_assembly19.fasta)

Manuel-DominguezCBG commented 1 year ago

API working!!

I will do some testing over the weekend. Then I will describe, in a new comment, the minor changes I needed to make while debugging, and after that I will close this issue.

Thank you so much for your help

bw2 commented 1 year ago

That's great! You might be the first to have a local install. (I updated requirements.txt based on your previous issue)

Manuel-DominguezCBG commented 1 year ago

Yes, I noticed you have been updating requirements.txt. Thanks for that.

This is what I did to install the API locally (I am saving it here as a record for myself or for anyone else who may find these steps useful).

I used Conda to get Python 3.6. I did this because I ran into incompatibility problems with TensorFlow (I don't remember the details now).

Then I followed the steps in the README file. The first time I did this I needed to install some additional Python libraries, but I have since reinstalled everything on a second computer and I believe that your requirements.txt contains everything needed.

Here are the libraries in my Conda env:

```
# packages in environment at /Users/jocotton/miniconda3/envs/lookup3.6:
#
# Name                  Version     Build             Channel
absl-py                 1.4.0       pypi_0            pypi
argcomplete             3.0.8       pypi_0            pypi
argh                    0.27.2      pypi_0            pypi
astor                   0.8.1       pypi_0            pypi
async-timeout           4.0.2       pypi_0            pypi
biopython               1.79        pypi_0            pypi
ca-certificates         2023.5.7    h8857fd0_0        conda-forge
cached-property         1.5.2       pypi_0            pypi
certifi                 2016.9.26   py36_0            conda-forge
charset-normalizer      2.0.12      pypi_0            pypi
click                   8.0.4       pypi_0            pypi
dataclasses             0.8         pypi_0            pypi
feedparser              6.0.10      pypi_0            pypi
flask                   2.0.3       pypi_0            pypi
flask-cors              3.0.10      pypi_0            pypi
flask-talisman          1.0.0       pypi_0            pypi
gast                    0.5.4       pypi_0            pypi
gffutils                0.11.1      pypi_0            pypi
google-pasta            0.2.0       pypi_0            pypi
grpcio                  1.48.2      pypi_0            pypi
gunicorn                20.1.0      pypi_0            pypi
h5py                    3.1.0       pypi_0            pypi
idna                    3.4         pypi_0            pypi
importlib-metadata      4.8.3       pypi_0            pypi
intervaltree            3.1.0       pypi_0            pypi
iso8601                 1.1.0       pypi_0            pypi
itsdangerous            2.0.1       pypi_0            pypi
jinja2                  3.0.3       pypi_0            pypi
keras                   2.1.5       pypi_0            pypi
keras-applications      1.0.8       pypi_0            pypi
keras-preprocessing     1.1.2       pypi_0            pypi
libcxx                  14.0.6      h9765a3e_0
libffi                  3.3         hb1e8313_2
markdown                3.3.7       pypi_0            pypi
markdown2               2.4.8       pypi_0            pypi
markupsafe              2.0.1       pypi_0            pypi
ncurses                 6.4         hcec6c5f_0
numpy                   1.19.5      pypi_0            pypi
openssl                 1.1.1t      hfd90126_0        conda-forge
packaging               21.3        pypi_0            pypi
pandas                  1.1.5       pypi_0            pypi
pangolin                1.0.2       pypi_0            pypi
pip                     21.2.2      py36hecd8cb5_0
protobuf                3.19.6      pypi_0            pypi
pyfaidx                 0.7.1       pypi_0            pypi
pyfastx                 1.1.0       pypi_0            pypi
pyparsing               3.0.9       pypi_0            pypi
pysam                   0.21.0      pypi_0            pypi
python                  3.6.13      h88f2d9e_0
python-dateutil         2.8.2       pypi_0            pypi
python_abi              3.6         2_cp36m           conda-forge
pytz                    2023.3      pypi_0            pypi
pyvcf                   0.4.3       pypi_0            pypi
pyyaml                  6.0         pypi_0            pypi
reader                  1.18        pypi_0            pypi
readline                8.2         hca72f7f_0
redis                   4.3.6       pypi_0            pypi
requests                2.27.1      pypi_0            pypi
scipy                   1.5.4       pypi_0            pypi
setuptools              58.0.4      py36hecd8cb5_0
sgmllib3k               1.0.0       pypi_0            pypi
simplejson              3.19.1      pypi_0            pypi
six                     1.16.0      pypi_0            pypi
sortedcontainers        2.4.0       pypi_0            pypi
spliceai                1.3.2       pypi_0            pypi
sqlite                  3.41.2      h6c40b1e_0
tensorboard             1.14.0      pypi_0            pypi
tensorflow              1.14.0      pypi_0            pypi
tensorflow-estimator    1.14.0      pypi_0            pypi
termcolor               1.1.0       pypi_0            pypi
tk                      8.6.12      h5d9f67b_0
torch                   1.10.2      pypi_0            pypi
typing-extensions       4.1.1       pypi_0            pypi
urllib3                 1.26.16     pypi_0            pypi
werkzeug                2.0.3       pypi_0            pypi
wheel                   0.37.1      pyhd3eb1b0_0
wrapt                   1.15.0      pypi_0            pypi
xz                      5.4.2       h6c40b1e_0
zipp                    3.6.0       pypi_0            pypi
zlib                    1.2.13      h4dc903c_0
```

I also hit the following error several times:

AttributeError: 'str' object has no attribute 'decode'

I solved the problem by deleting the .decode('utf-8').
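For anyone who prefers not to delete the call outright, a more tolerant variant is to decode only when the value really is bytes. This is a minimal sketch, and `to_text` is a hypothetical helper name, not something from SpliceAI-lookup or its dependencies. The thread does not record where the error was raised. One commonly reported cause of this message is an older Keras reading HDF5 attributes under h5py 3.x, which returns `str` where earlier versions returned `bytes`, but that is an assumption here.

```python
def to_text(value, encoding="utf-8"):
    """Return `value` as str, decoding only when it is actually bytes."""
    if isinstance(value, (bytes, bytearray)):
        return value.decode(encoding)
    # Already text (or some other type): hand it back unchanged.
    return value
```

Replacing `x.decode('utf-8')` with `to_text(x)` keeps the code working whether the library hands back `bytes` or `str`.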

Finally, there was the unrecognized chromosome problem, which was caused by the FASTA file.

And that is all. It is working now.

I have another question, but it is unrelated, so I am going to close this issue and ask it in a new one.

Thanks!

By the way, if you see something wrong in the libraries installed, let me know by reopening the issue.