Closed Manuel-DominguezCBG closed 1 year ago
This is probably because the fasta has "chr3", but the code is calling the chromosome "3".
I just updated the download URL @ https://github.com/broadinstitute/SpliceAI-lookup/blob/master/README.md#local-install. Could you please try that hg19 fasta instead? (https://storage.cloud.google.com/gcp-public-data--broad-references/hg19/v0/Homo_sapiens_assembly19.fasta)
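To illustrate the naming mismatch, here is a minimal sketch of a lookup that tolerates a missing or extra "chr" prefix. The function name `resolve_contig` and the example contig set are hypothetical and are not part of SpliceAI-lookup. The sketch does not handle other naming differences, such as "chrM" versus "MT".

```python
def resolve_contig(chrom, contigs):
    """Return the name in `contigs` matching `chrom`, ignoring a 'chr' prefix mismatch."""
    if chrom in contigs:
        return chrom
    # Try the alternative spelling: strip the "chr" prefix, or add it.
    alt = chrom[3:] if chrom.startswith("chr") else "chr" + chrom
    if alt in contigs:
        return alt
    raise KeyError(f"{chrom!r} (or {alt!r}) not found among FASTA contigs")


# Example: a FASTA whose contigs are named "chr1", "chr2", ... (assumed names)
ucsc_style_contigs = {"chr1", "chr2", "chr3"}
print(resolve_contig("3", ucsc_style_contigs))
```

Switching to a FASTA whose contig names already match what the code expects, as suggested above, avoids the need for any such translation.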
API working!!
I will do some testing over the weekend, then post a new comment describing the minor debugging changes I needed to make, and then close this issue.
Thank you so much for your help
That's great! You might be the first to have a local install. (I updated requirements.txt based on your previous issue)
Yes, I noticed you have been updating requirements.txt. Thanks for that.
This is what I did to install the API locally (I am recording it here for myself and for anyone else who may find these steps useful).
I used Conda to get Python 3.6. I did this because I ran into incompatibility problems with TensorFlow (I don't remember the details now).
Then I followed the steps in the README. The first time, I needed to install some Python libraries by hand, but I have since reinstalled on a second computer and I believe your requirements.txt contains everything needed.
Here are the libraries in my Conda env:
`# packages in environment at /Users/jocotton/miniconda3/envs/lookup3.6:
# Name Version Build Channel
absl-py 1.4.0 pypi_0 pypi
argcomplete 3.0.8 pypi_0 pypi
argh 0.27.2 pypi_0 pypi
astor 0.8.1 pypi_0 pypi
async-timeout 4.0.2 pypi_0 pypi
biopython 1.79 pypi_0 pypi
ca-certificates 2023.5.7 h8857fd0_0 conda-forge
cached-property 1.5.2 pypi_0 pypi
certifi 2016.9.26 py36_0 conda-forge
charset-normalizer 2.0.12 pypi_0 pypi
click 8.0.4 pypi_0 pypi
dataclasses 0.8 pypi_0 pypi
feedparser 6.0.10 pypi_0 pypi
flask 2.0.3 pypi_0 pypi
flask-cors 3.0.10 pypi_0 pypi
flask-talisman 1.0.0 pypi_0 pypi
gast 0.5.4 pypi_0 pypi
gffutils 0.11.1 pypi_0 pypi
google-pasta 0.2.0 pypi_0 pypi
grpcio 1.48.2 pypi_0 pypi
gunicorn 20.1.0 pypi_0 pypi
h5py 3.1.0 pypi_0 pypi
idna 3.4 pypi_0 pypi
importlib-metadata 4.8.3 pypi_0 pypi
intervaltree 3.1.0 pypi_0 pypi
iso8601 1.1.0 pypi_0 pypi
itsdangerous 2.0.1 pypi_0 pypi
jinja2 3.0.3 pypi_0 pypi
keras 2.1.5 pypi_0 pypi
keras-applications 1.0.8 pypi_0 pypi
keras-preprocessing 1.1.2 pypi_0 pypi
libcxx 14.0.6 h9765a3e_0
libffi 3.3 hb1e8313_2
markdown 3.3.7 pypi_0 pypi
markdown2 2.4.8 pypi_0 pypi
markupsafe 2.0.1 pypi_0 pypi
ncurses 6.4 hcec6c5f_0
numpy 1.19.5 pypi_0 pypi
openssl 1.1.1t hfd90126_0 conda-forge
packaging 21.3 pypi_0 pypi
pandas 1.1.5 pypi_0 pypi
pangolin 1.0.2 pypi_0 pypi
pip 21.2.2 py36hecd8cb5_0
protobuf 3.19.6 pypi_0 pypi
pyfaidx 0.7.1 pypi_0 pypi
pyfastx 1.1.0 pypi_0 pypi
pyparsing 3.0.9 pypi_0 pypi
pysam 0.21.0 pypi_0 pypi
python 3.6.13 h88f2d9e_0
python-dateutil 2.8.2 pypi_0 pypi
python_abi 3.6 2_cp36m conda-forge
pytz 2023.3 pypi_0 pypi
pyvcf 0.4.3 pypi_0 pypi
pyyaml 6.0 pypi_0 pypi
reader 1.18 pypi_0 pypi
readline 8.2 hca72f7f_0
redis 4.3.6 pypi_0 pypi
requests 2.27.1 pypi_0 pypi
scipy 1.5.4 pypi_0 pypi
setuptools 58.0.4 py36hecd8cb5_0
sgmllib3k 1.0.0 pypi_0 pypi
simplejson 3.19.1 pypi_0 pypi
six 1.16.0 pypi_0 pypi
sortedcontainers 2.4.0 pypi_0 pypi
spliceai 1.3.2 pypi_0 pypi
sqlite 3.41.2 h6c40b1e_0
tensorboard 1.14.0 pypi_0 pypi
tensorflow 1.14.0 pypi_0 pypi
tensorflow-estimator 1.14.0 pypi_0 pypi
termcolor 1.1.0 pypi_0 pypi
tk 8.6.12 h5d9f67b_0
torch 1.10.2 pypi_0 pypi
typing-extensions 4.1.1 pypi_0 pypi
urllib3 1.26.16 pypi_0 pypi
werkzeug 2.0.3 pypi_0 pypi
wheel 0.37.1 pyhd3eb1b0_0
wrapt 1.15.0 pypi_0 pypi
xz 5.4.2 h6c40b1e_0
zipp 3.6.0 pypi_0 pypi
zlib 1.2.13 h4dc903c_0
`
I also hit the following error several times:
AttributeError: 'str' object has no attribute 'decode'
I solved the problem by deleting the .decode('utf-8') calls.
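For anyone who would rather not delete the calls outright, below is a minimal sketch of a helper that decodes only when the value is actually `bytes`. The name `to_text` is hypothetical, and the thread does not say which call sites were affected. One plausible cause, not confirmed here, is a library such as h5py 3.x returning `str` where older code expected `bytes`.

```python
def to_text(value, encoding="utf-8"):
    """Return `value` as str, decoding it only if it is bytes."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    # Already a str (or some other type): leave it unchanged.
    return value


# Both inputs produce the same text, so the caller no longer depends
# on which type the underlying library happens to return.
print(to_text(b"spliceai"))
print(to_text("spliceai"))
```

Replacing `x.decode('utf-8')` with `to_text(x)` keeps the code working with either return type, whereas deleting the call only works when the value is already a `str`.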
The last problem was the unrecognized chromosome, which was caused by the FASTA file.
And that is all. It is working now.
I have another question, but it is unrelated, so I am going to close this issue and ask it in a new one.
Thanks!
By the way, if you see something wrong in the libraries installed, let me know by reopening the issue.
I have installed the API and it seems to work. However, when I query for a variant, the API returns this:
{"variant": "chr3-37035154-G-A", "hg": "37", "source": "spliceai", "error": "ERROR: <class 'KeyError'>: '3 not in /Users/jocotton/hg19.fa.'", "duration": "0:00:00.031399"}
This looks like a problem with the format of the FASTA file or with how it is read. I downloaded both from https://www.ncbi.nlm.nih.gov/genome/guide/human/
What could be the cause of the problem?
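One way to check which chromosome naming a FASTA file uses is to list the names on its header lines. The sketch below is plain Python with no dependencies, and the function name `fasta_contig_names` is hypothetical. It assumes an uncompressed FASTA in which each record starts with a `>` line and the contig name is the first whitespace-separated token.

```python
def fasta_contig_names(path, limit=None):
    """Return contig names from the '>' header lines of an uncompressed FASTA file."""
    names = []
    with open(path) as handle:
        for line in handle:
            if not line.startswith(">"):
                continue  # sequence line, not a header
            fields = line[1:].split()
            if fields:
                # ">chr3 some description" -> "chr3"
                names.append(fields[0])
            if limit is not None and len(names) >= limit:
                break  # stop early; genome FASTA files are large
    return names
```

Calling `fasta_contig_names("/Users/jocotton/hg19.fa", limit=5)` would show whether the file uses names like "chr3" or "3", which can then be compared with the name in the KeyError message.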