francoiskroll closed this issue 6 months ago
Oh... Thank you for uncovering this critical issue.
It seems the problem arises because the code never converts the input sequence to uppercase with .upper(). When the input sequence contains lowercase letters, the GC count features are miscounted, which produces unexpected outputs.
# genet/predict/PrimeEditor.py/DeepPrimeGuideRNA
'GC_count_PBS' : [pbs.count('G') + pbs.count('C')],
'GC_count_RTT' : [rtt.count('G') + rtt.count('C')],
'GC_count_RT-PBS' : [self.rtpbs.count('G') + self.rtpbs.count('C')],
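To illustrate the effect, here is a minimal sketch of a case-insensitive GC count. The helper name gc_count and the example sequence are hypothetical and are not part of GenET; the snippet only shows why str.count('G') misses lowercase bases and how normalizing with .upper() first avoids that.

```python
def gc_count(seq):
    """Count G and C bases regardless of letter case (hypothetical helper)."""
    seq = seq.upper()  # normalize first, so 'g' and 'c' are counted too
    return seq.count("G") + seq.count("C")


pbs = "ggcaTTgc"  # made-up mixed-case sequence, for illustration only

# Case-sensitive count, as in the feature code above: lowercase g/c are missed.
naive = pbs.count("G") + pbs.count("C")

# Case-insensitive count after normalizing to uppercase.
fixed = gc_count(pbs)

print(naive, fixed)
```

With this input the case-sensitive count is 0 while the normalized count is 5, which is the kind of discrepancy that would change the GC features fed to the model.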
For now, inputting all sequences in uppercase should yield the correct DeepPrime score. Could you please keep this issue open? I will work on a hotfix release as soon as possible to address the bug.
Thank you for finding and reporting this important bug!
P.S. I prefer this repo for discussing GenET! Thank you.
I've fixed the bugs you reported and included the fixes in this version update. Thanks for bringing up these important matters!
A few additional points:
If you have plans to test, please go ahead and let me know the results. I would appreciate it. If you don't have plans or if there are no issues even after checking, I'll close this issue.
P.S. With this update, the input format of DeepPrime (not DeepPrimeGuideRNA) has changed significantly. Please keep this in mind for future use!
Amazing! That input for genet 0.15.0 looks great.
I gave it a try but I am having some sort of installation issue, sorry! I am guessing it is related to #86. Here is what I tried:
In Terminal:
conda create -n deepprime2
conda activate deepprime2
conda config --env --set subdir osx-64
conda install python=3.10
pip install genet
It said it worked.
Then in a Jupyter Notebook:
from genet.predict import DeepPrime
This runs forever... A simple statement such as print('hey') works, so I do not think it is a Python/Jupyter issue. I do not think it's the same as #84, as here it's getting stuck at the import phase. Is it still missing some dependency?
I apologize for the inconvenience you’ve been experiencing. Let’s test the import of some key packages individually to identify any problems.
Can you please check each one separately to see if there are any issues?
import Bio
import RNA
import torch
import tensorflow
import silence_tensorflow
import genet.predict.PrimeEditor
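If it is easier, the checks can also be scripted. The following is only a sketch: the helper check_import and the 120-second default timeout are assumptions, not part of GenET. It imports each module in a fresh interpreter via subprocess so that a hanging import is killed after the timeout instead of blocking the notebook.

```python
import subprocess
import sys


def check_import(module, timeout=120.0):
    """Try `import module` in a fresh interpreter.

    Returns 'ok', 'failed' (non-zero exit, e.g. ImportError), or 'timeout'.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", f"import {module}"],
            capture_output=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this.
        return "timeout"
    return "ok" if result.returncode == 0 else "failed"


# Quick demonstration with a standard-library module.
print("json:", check_import("json"))
```

To test the packages listed above, call check_import with each name in turn, for example check_import("tensorflow") or check_import("genet.predict.PrimeEditor"), and note which ones return 'timeout'.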
Thank you for your patience! 🙏
No problem, happy to help.
import tensorflow & import genet.predict.PrimeEditor keep going forever (well, I killed them after 2 min). All the rest import fine (import torch took 30 sec but did finish).
Here is the output from conda list, if that's helpful:
# packages in environment at /Users/francoiskroll/miniconda3/envs/deepprime2:
#
# Name Version Build Channel
absl-py 2.1.0 pypi_0 pypi
appnope 0.1.4 pyhd8ed1ab_0 conda-forge
asttokens 2.4.1 pyhd8ed1ab_0 conda-forge
astunparse 1.6.3 pypi_0 pypi
biopython 1.83 pypi_0 pypi
bzip2 1.0.8 h6c40b1e_5
ca-certificates 2024.3.11 hecd8cb5_0
cachetools 5.3.3 pypi_0 pypi
certifi 2024.2.2 pypi_0 pypi
charset-normalizer 3.3.2 pypi_0 pypi
comm 0.2.2 pyhd8ed1ab_0 conda-forge
cramjam 2.8.3 pypi_0 pypi
debugpy 1.8.1 py310h5daac23_0 conda-forge
decorator 5.1.1 pyhd8ed1ab_0 conda-forge
editdistance 0.8.1 pypi_0 pypi
exceptiongroup 1.2.0 pyhd8ed1ab_2 conda-forge
executing 2.0.1 pyhd8ed1ab_0 conda-forge
fastparquet 2024.2.0 pypi_0 pypi
filelock 3.13.4 pypi_0 pypi
flatbuffers 1.12 pypi_0 pypi
fsspec 2024.3.1 pypi_0 pypi
gast 0.4.0 pypi_0 pypi
genet 0.15.0 pypi_0 pypi
google-auth 2.29.0 pypi_0 pypi
google-auth-oauthlib 0.4.6 pypi_0 pypi
google-pasta 0.2.0 pypi_0 pypi
grpcio 1.62.2 pypi_0 pypi
h5py 3.11.0 pypi_0 pypi
idna 3.7 pypi_0 pypi
importlib-metadata 7.1.0 pyha770c72_0 conda-forge
importlib_metadata 7.1.0 hd8ed1ab_0 conda-forge
ipykernel 6.29.3 pyh3cd1d5f_0 conda-forge
ipython 8.22.2 pyh707e725_0 conda-forge
jedi 0.19.1 pyhd8ed1ab_0 conda-forge
jinja2 3.1.3 pypi_0 pypi
jupyter_client 8.6.1 pyhd8ed1ab_0 conda-forge
jupyter_core 5.7.2 py310h2ec42d9_0 conda-forge
keras 2.9.0 pypi_0 pypi
keras-preprocessing 1.1.2 pypi_0 pypi
libclang 18.1.1 pypi_0 pypi
libcxx 16.0.6 hd57cbcb_0 conda-forge
libffi 3.4.4 hecd8cb5_0
libsodium 1.0.18 hbcb3906_1 conda-forge
markdown 3.6 pypi_0 pypi
markupsafe 2.1.5 pypi_0 pypi
matplotlib-inline 0.1.7 pyhd8ed1ab_0 conda-forge
mpmath 1.3.0 pypi_0 pypi
ncurses 6.4 hcec6c5f_0
nest-asyncio 1.6.0 pyhd8ed1ab_0 conda-forge
networkx 3.3 pypi_0 pypi
numpy 1.26.4 pypi_0 pypi
oauthlib 3.2.2 pypi_0 pypi
openssl 3.2.1 hd75f5a5_1 conda-forge
opt-einsum 3.3.0 pypi_0 pypi
packaging 24.0 pyhd8ed1ab_0 conda-forge
pandas 2.2.2 pypi_0 pypi
parso 0.8.4 pyhd8ed1ab_0 conda-forge
pexpect 4.9.0 pyhd8ed1ab_0 conda-forge
pickleshare 0.7.5 py_1003 conda-forge
pip 23.3.1 py310hecd8cb5_0
platformdirs 4.2.1 pyhd8ed1ab_0 conda-forge
prompt-toolkit 3.0.42 pyha770c72_0 conda-forge
protobuf 3.19.6 pypi_0 pypi
psutil 5.9.8 py310hb372a2b_0 conda-forge
ptyprocess 0.7.0 pyhd3deb0d_0 conda-forge
pure_eval 0.2.2 pyhd8ed1ab_0 conda-forge
pyarrow 16.0.0 pypi_0 pypi
pyasn1 0.6.0 pypi_0 pypi
pyasn1-modules 0.4.0 pypi_0 pypi
pygments 2.17.2 pyhd8ed1ab_0 conda-forge
python 3.10.14 h5ee71fb_0
python-dateutil 2.9.0.post0 pypi_0 pypi
python_abi 3.10 2_cp310 conda-forge
pytz 2024.1 pypi_0 pypi
pyzmq 26.0.2 py310hdd8d2da_0 conda-forge
readline 8.2 hca72f7f_0
regex 2024.4.16 pypi_0 pypi
requests 2.31.0 pypi_0 pypi
requests-oauthlib 2.0.0 pypi_0 pypi
rsa 4.9 pypi_0 pypi
setuptools 68.2.2 py310hecd8cb5_0
silence-tensorflow 1.2.1 pypi_0 pypi
six 1.16.0 pyh6c4a22f_0 conda-forge
sqlite 3.41.2 h6c40b1e_0
stack_data 0.6.2 pyhd8ed1ab_0 conda-forge
support-developer 1.0.5 pypi_0 pypi
sympy 1.12 pypi_0 pypi
tensorboard 2.9.1 pypi_0 pypi
tensorboard-data-server 0.6.1 pypi_0 pypi
tensorboard-plugin-wit 1.8.1 pypi_0 pypi
tensorflow 2.9.3 pypi_0 pypi
tensorflow-estimator 2.9.0 pypi_0 pypi
tensorflow-io-gcs-filesystem 0.36.0 pypi_0 pypi
termcolor 2.4.0 pypi_0 pypi
tk 8.6.12 h5d9f67b_0
torch 2.2.2 pypi_0 pypi
tornado 6.4 py310hb372a2b_0 conda-forge
tqdm 4.66.2 pypi_0 pypi
traitlets 5.14.3 pyhd8ed1ab_0 conda-forge
typing_extensions 4.11.0 pyha770c72_0 conda-forge
tzdata 2024.1 pypi_0 pypi
urllib3 2.2.1 pypi_0 pypi
viennarna 2.6.4 pypi_0 pypi
wcwidth 0.2.13 pyhd8ed1ab_0 conda-forge
werkzeug 3.0.2 pypi_0 pypi
wheel 0.41.2 py310hecd8cb5_0
wrapt 1.16.0 pypi_0 pypi
xz 5.4.6 h6c40b1e_0
zeromq 4.3.5 h93d8f39_0 conda-forge
zipp 3.17.0 pyhd8ed1ab_0 conda-forge
zlib 1.2.13 h4dc903c_0
I did not change anything in that environment after my reply above with conda create -n deepprime2 etc.
The environment where I have genet 0.14.1 installed & working looks like:
# packages in environment at /Users/francoiskroll/miniconda3/envs/deepprime:
#
# Name Version Build Channel
abseil-cpp 20210324.2 h23ab428_0
absl-py 0.15.0 pyhd3eb1b0_0
aiohttp 3.9.3 py38h6c40b1e_0
aiosignal 1.2.0 pyhd3eb1b0_0
appnope 0.1.4 pyhd8ed1ab_0 conda-forge
asttokens 2.4.1 pyhd8ed1ab_0 conda-forge
astunparse 1.6.3 py_0
async-timeout 4.0.3 py38hecd8cb5_0
attrs 23.1.0 py38hecd8cb5_0
backcall 0.2.0 pyh9f0ad1d_0 conda-forge
biopython 1.83 pypi_0 pypi
blas 1.0 mkl
blinker 1.6.2 py38hecd8cb5_0
brotli-python 1.0.9 py38he9d5cce_7
c-ares 1.19.1 h6c40b1e_0
ca-certificates 2024.3.11 hecd8cb5_0
cached-property 1.5.2 py_0
cachetools 4.2.2 pyhd3eb1b0_0
certifi 2024.2.2 pyhd8ed1ab_0 conda-forge
cffi 1.16.0 py38h6c40b1e_0
charset-normalizer 2.0.4 pyhd3eb1b0_0
click 8.1.7 py38hecd8cb5_0
comm 0.2.2 pyhd8ed1ab_0 conda-forge
cramjam 2.8.3 pypi_0 pypi
cryptography 41.0.3 py38ha2381d6_0
debugpy 1.8.1 py38h1f5f77c_0 conda-forge
decorator 5.1.1 pyhd8ed1ab_0 conda-forge
editdistance 0.8.1 pypi_0 pypi
executing 2.0.1 pyhd8ed1ab_0 conda-forge
fastparquet 0.8.3 pypi_0 pypi
flatbuffers 1.12 pypi_0 pypi
frozenlist 1.4.0 py38h6c40b1e_0
fsspec 2024.3.1 pypi_0 pypi
gast 0.4.0 pyhd3eb1b0_0
genet 0.14.1 pypi_0 pypi
giflib 5.2.1 h6c40b1e_3
google-auth 2.6.0 pyhd3eb1b0_0
google-auth-oauthlib 0.4.4 pyhd3eb1b0_0
google-pasta 0.2.0 pyhd3eb1b0_0
grpc-cpp 1.39.1 h3acd2d4_1 conda-forge
grpcio 1.39.0 py38h4924b5d_0 conda-forge
h5py 3.1.0 nompi_py38h5142359_100 conda-forge
hdf5 1.10.6 h10fe05b_1
icu 68.1 h23ab428_0
idna 3.4 py38hecd8cb5_0
importlib-metadata 7.0.1 py38hecd8cb5_0
importlib_metadata 7.0.1 hd8ed1ab_0 conda-forge
intel-openmp 2023.1.0 ha357a0b_43548
ipykernel 6.29.3 pyh3cd1d5f_0 conda-forge
ipython 8.12.2 pyhd1c38e8_0 conda-forge
jedi 0.19.1 pyhd8ed1ab_0 conda-forge
jpeg 9e h6c40b1e_1
jupyter_client 8.6.1 pyhd8ed1ab_0 conda-forge
jupyter_core 5.7.2 py38h50d1736_0 conda-forge
keras 2.6.0 pyhd3eb1b0_0
keras-preprocessing 1.1.2 pyhd3eb1b0_0
krb5 1.20.1 hdba6334_1
libcurl 8.2.1 ha585b31_0
libcxx 16.0.6 hd57cbcb_0 conda-forge
libedit 3.1.20230828 h6c40b1e_0
libev 4.33 h9ed2024_1
libffi 3.4.4 hecd8cb5_0
libgfortran 5.0.0 11_3_0_hecd8cb5_28
libgfortran5 11.3.0 h9dfd629_28
libnghttp2 1.52.0 h1c88b7d_1
libpng 1.6.39 h6c40b1e_0
libprotobuf 3.16.0 hcf210ce_0 conda-forge
libsodium 1.0.18 hbcb3906_1 conda-forge
libssh2 1.10.0 hdb2fb19_2
libzlib 1.2.13 h8a1eda9_5 conda-forge
llvm-openmp 18.1.3 hb6ac08f_0 conda-forge
markdown 3.4.1 py38hecd8cb5_0
markupsafe 2.1.3 py38h6c40b1e_0
matplotlib-inline 0.1.7 pyhd8ed1ab_0 conda-forge
mkl 2023.1.0 h8e150cf_43560
mkl-service 2.4.0 py38h6c40b1e_1
mkl_fft 1.3.8 py38h6c40b1e_0
mkl_random 1.2.4 py38ha357a0b_0
multidict 6.0.4 py38h6c40b1e_0
ncurses 6.4 hcec6c5f_0
nest-asyncio 1.6.0 pyhd8ed1ab_0 conda-forge
numpy 1.19.5 py38h3cdbb29_5
numpy-base 1.19.5 py38hff596df_5
oauthlib 3.2.2 py38hecd8cb5_0
openssl 1.1.1w h8a1eda9_0 conda-forge
opt_einsum 3.3.0 pyhd3eb1b0_1
packaging 23.2 py38hecd8cb5_0
pandas 1.4.4 pypi_0 pypi
parso 0.8.4 pyhd8ed1ab_0 conda-forge
perl 5.32.1 0_h435f0c2_perl5
pexpect 4.9.0 pyhd8ed1ab_0 conda-forge
pickleshare 0.7.5 py_1003 conda-forge
pip 23.3.1 py38hecd8cb5_0
platformdirs 3.10.0 py38hecd8cb5_0
pooch 1.7.0 py38hecd8cb5_0
prompt-toolkit 3.0.42 pyha770c72_0 conda-forge
prompt_toolkit 3.0.42 hd8ed1ab_0 conda-forge
protobuf 3.16.0 py38ha048514_0 conda-forge
psutil 5.9.8 py38hae2e43d_0 conda-forge
ptyprocess 0.7.0 pyhd3deb0d_0 conda-forge
pure_eval 0.2.2 pyhd8ed1ab_0 conda-forge
pyarrow 15.0.2 pypi_0 pypi
pyasn1 0.4.8 pyhd3eb1b0_0
pyasn1-modules 0.2.8 py_0
pycparser 2.21 pyhd3eb1b0_0
pygments 2.17.2 pyhd8ed1ab_0 conda-forge
pyjwt 2.4.0 py38hecd8cb5_0
pyopenssl 23.2.0 py38hecd8cb5_0
pysocks 1.7.1 py38_1
python 3.8.18 h218abb5_0
python-dateutil 2.9.0.post0 pypi_0 pypi
python_abi 3.8 2_cp38 conda-forge
pytz 2024.1 pypi_0 pypi
pyzmq 26.0.0 py38hf69f452_0 conda-forge
re2 2021.09.01 he49afe7_0 conda-forge
readline 8.2 hca72f7f_0
regex 2024.4.16 pypi_0 pypi
requests 2.31.0 py38hecd8cb5_1
requests-oauthlib 1.3.0 py_0
rsa 4.7.2 pyhd3eb1b0_1
scipy 1.10.1 py38hf241641_1
setuptools 68.2.2 py38hecd8cb5_0
silence-tensorflow 1.2.1 pypi_0 pypi
six 1.15.0 pyhd3eb1b0_0
snappy 1.1.10 hcec6c5f_1
sqlite 3.41.2 h6c40b1e_0
stack_data 0.6.2 pyhd8ed1ab_0 conda-forge
support-developer 1.0.5 pypi_0 pypi
tbb 2021.8.0 ha357a0b_0
tensorboard 2.11.0 py38_0
tensorboard-data-server 0.6.1 py38h7242b5c_0
tensorboard-plugin-wit 1.6.0 py_0
tensorflow 2.6.0 py38h52b2510_1 conda-forge
tensorflow-base 2.6.0 py38h1615122_1 conda-forge
tensorflow-estimator 2.6.0 py38h02c4698_1 conda-forge
termcolor 1.1.0 py38hecd8cb5_1
tk 8.6.12 h5d9f67b_0
torch 1.11.0 pypi_0 pypi
tornado 6.4 py38hae2e43d_0 conda-forge
tqdm 4.66.2 pypi_0 pypi
traitlets 5.14.3 pyhd8ed1ab_0 conda-forge
typing_extensions 3.7.4.3 pyha847dfd_0
urllib3 2.1.0 py38hecd8cb5_1
viennarna 2.6.4 py38pl5321hda9a618_0 bioconda
wcwidth 0.2.13 pyhd8ed1ab_0 conda-forge
werkzeug 2.3.8 py38hecd8cb5_0
wheel 0.41.2 py38hecd8cb5_0
wrapt 1.12.1 py38haf1e3a3_1
xz 5.4.6 h6c40b1e_0
yarl 1.9.3 py38h6c40b1e_0
zeromq 4.3.5 h93d8f39_0 conda-forge
zipp 3.17.0 py38hecd8cb5_0
zlib 1.2.13 h8a1eda9_5 conda-forge
So I would guess tensorflow 2.9.3 (cannot import properly) vs. tensorflow 2.6.0 (works with genet 0.14.1) has something to do with it... Note that the Python version is different too.
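For reference, the two listings can be compared programmatically. This is a rough sketch under the assumption that each non-comment line of conda list output starts with the package name followed by the version; parse_conda_list and diff_versions are hypothetical helpers, and the excerpts below contain only a few rows copied from the two environments above.

```python
def parse_conda_list(text):
    """Map package name -> version from `conda list` output."""
    versions = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the header comments
        fields = line.split()
        if len(fields) >= 2:
            versions[fields[0]] = fields[1]
    return versions


def diff_versions(old, new):
    """Return {name: (old_version, new_version)} for shared packages that differ."""
    shared = sorted(old.keys() & new.keys())
    return {name: (old[name], new[name]) for name in shared if old[name] != new[name]}


# Excerpt from the working environment (deepprime).
working = """
keras 2.6.0 pyhd3eb1b0_0
numpy 1.19.5 py38h3cdbb29_5
python 3.8.18 h218abb5_0
tensorflow 2.6.0 py38h52b2510_1 conda-forge
viennarna 2.6.4 py38pl5321hda9a618_0 bioconda
"""

# Excerpt from the environment where the import hangs (deepprime2).
broken = """
keras 2.9.0 pypi_0 pypi
numpy 1.26.4 pypi_0 pypi
python 3.10.14 h5ee71fb_0
tensorflow 2.9.3 pypi_0 pypi
viennarna 2.6.4 pypi_0 pypi
"""

changes = diff_versions(parse_conda_list(working), parse_conda_list(broken))
for name, (old_version, new_version) in changes.items():
    print(f"{name}: {old_version} -> {new_version}")
```

On these excerpts, keras, numpy, python and tensorflow differ while viennarna is unchanged. Pasting the full listings into the two strings would give the complete set of version differences.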
Does that help?
This seems to be a separate bug, so I've opened a new issue. Can we continue the discussion in #92?
Thank you!
Is this the right repo to raise the issue? Or would you prefer the DeepPrime repo? I'll put it here, as this is where I got the example from the documentation.
Input sequences in uppercase or lowercase give different outputs, which surely is not right...
Here I am just taking the example from the documentation:
From what I can tell, this issue does not seem to occur with DeepPrime(...), i.e. whether wt_seq & mut_seq are lowercase or uppercase gives the same dataframe after pegrna.predict(...).