Closed umasstr closed 1 year ago
Hey @umasstr, thank you for reporting these bugs. Appreciate it!
(1) The model file for ProteinBERT-RBP component (ProteinBERT_TrainWithWholeProteinSet_defaultSetting_ModelFile.pkl.gz) can be downloaded from here. You may need to gunzip it before use. Yes, you can use this model without any training. It has already been trained with our training set composed of human RBPs and other proteins.
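For anyone without the gunzip command-line tool, the decompression step can be sketched with Python's standard library. This is a minimal sketch, not part of HydRA; the helper name gunzip_file is hypothetical.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_file(src, dst=None):
    """Decompress a .gz file; by default write next to it without the .gz suffix."""
    src = Path(src)
    if dst is None:
        if src.suffix != ".gz":
            raise ValueError(f"expected a .gz file, got: {src}")
        # "model.pkl.gz" -> "model.pkl"
        dst = src.with_suffix("")
    dst = Path(dst)
    # Stream the data so large model files are not loaded into memory at once.
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst
```

Calling gunzip_file("ProteinBERT_TrainWithWholeProteinSet_defaultSetting_ModelFile.pkl.gz") would write the .pkl file that is then passed to --proteinBERT_modelfile.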
(2) The FileNotFoundError has been fixed. You can reinstall the HydRA package with the command below and then re-run the prediction:
python3 -m pip install --index-url https://test.pypi.org/simple/ --no-deps --upgrade hydra-rbp
Thank you! And let me know if it works on your side.
Hi @Wenhao-Jin, thanks for the response. Unfortunately, this doesn't run either. I ran this in a container (umasstr/hydra:latest), installing hydra from scratch, so there shouldn't be any dependency issues.
(HydRa) root@809c3ccff36d:/DATA# HydRa2_predict --seq_dir sequences --proteinBERT_modelfile ProteinBERT_TrainWithWholeProteinSet_defaultSetting_ModelFile.pkl --outdir out -n H01 --no-PIA --no-PPA
2023-01-14 16:58:02.627641: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2023-01-14 16:58:02.628262: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Traceback (most recent call last):
  File "/opt/conda/envs/HydRa/bin/HydRa2_predict", line 5, in <module>
    from HydRa.HydRa2_0_predict import call_main
  File "/opt/conda/envs/HydRa/lib/python3.8/site-packages/HydRa/__init__.py", line 1, in <module>
    from .models import *
  File "/opt/conda/envs/HydRa/lib/python3.8/site-packages/HydRa/models/__init__.py", line 2, in <module>
    from .Sequence_class import Protein_Sequence_Input5, Protein_Sequence_Input5_2, Protein_Sequence_Input5_noSS, Protein_Sequence_Input5_2_noSS
  File "/opt/conda/envs/HydRa/lib/python3.8/site-packages/HydRa/models/Sequence_class.py", line 727, in <module>
    class Protein_Sequence_Input5_bk:
  File "/opt/conda/envs/HydRa/lib/python3.8/site-packages/HydRa/models/Sequence_class.py", line 731, in Protein_Sequence_Input5_bk
    def __init__(self, files, class_labels, BioVec_name_dict=BioVec_name_dict, max_seqlen=1500):
NameError: name 'BioVec_name_dict' is not defined
Hi @umasstr,
Sorry for the inconvenience, and thank you for reporting this! I just fixed this new error and will update the Python package on PyPI/TestPyPI later today, so you can reinstall it using the same command as above:
python3 -m pip install --index-url https://test.pypi.org/simple/ --no-deps --upgrade hydra-rbp
Alternatively, if you want to try it now, open your "/opt/conda/envs/HydRa/lib/python3.8/site-packages/HydRa/models/Sequence_class.py" file and, on line 731, replace BioVec_name_dict=BioVec_name_dict with BioVec_name_dict.
Please let me know if it still doesn't work on your side. Thank you!
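The NameError in the traceback follows from how Python handles defaults: a default argument value is evaluated once, when the def statement executes, so BioVec_name_dict=BioVec_name_dict fails at import time if no BioVec_name_dict exists in the enclosing scope yet. Below is a minimal sketch of the failure and of the suggested workaround (making the parameter required). The class names and UNDEFINED_NAME_DICT are hypothetical stand-ins, not HydRa's actual code.

```python
def define_class_with_missing_default():
    """Try to define a class whose method default refers to an undefined name."""
    try:
        class ProteinInputBroken:
            # The default is evaluated right here, while the class body runs,
            # not later when __init__ is called.
            def __init__(self, files, name_dict=UNDEFINED_NAME_DICT, max_seqlen=1500):
                self.files = files
    except NameError as exc:
        return str(exc)
    return None


class ProteinInputFixed:
    # Workaround: drop the default so the caller must pass the dictionary.
    def __init__(self, files, name_dict, max_seqlen=1500):
        self.files = files
        self.name_dict = name_dict
        self.max_seqlen = max_seqlen


if __name__ == "__main__":
    print(define_class_with_missing_default())
    print(ProteinInputFixed(["a.fasta"], {"AAA": 0}).max_seqlen)
```

Note that with the workaround every caller has to supply the dictionary explicitly, since the parameter no longer has a default.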
Hi @umasstr, the package on TestPyPI is updated now. You can use the same command to upgrade HydRA, and it should work now:
python3 -m pip install --index-url https://test.pypi.org/simple/ --no-deps --upgrade hydra-rbp
Hey @Wenhao-Jin, I rebuilt the container and everything looks good! predict ran to completion on a small dataset. I'll give occlusion_map a try shortly.
I pushed umasstr/hydra:latest to docker. Feel free to retag and post it if this is useful to anyone. I forgot to activate the conda env in the build file (you need to run the command below before use), but otherwise it works well.
conda activate HydRa
Thank you again for responding to the issue--you'll get your beer in the mail!
Hey @umasstr, thank you so much for reporting the bugs and making the docker image! Really appreciate it!!
Hi @Wenhao-Jin,
I am wondering if there is a model posted in release assets that I am not seeing? I cannot find "ProteinBERT_TrainWithWholeProteinSet_defaultSetting_ModelFile.pkl" in this or the proteinBERT repo.
If I am reading correctly, I can use your model without training my own?
Seemingly unrelated: HydRa2_predict may be looking for a file outside of your environment, "/home/wjin/projects/RBP_pred/RBP_identification/Data/protVec_100d_3grams.csv".
Thanks!
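A path like the one above only exists on the developer's machine, which is the usual cause of a FileNotFoundError after installing elsewhere. One common way to avoid this is to look the data file up in a list of candidate directories and fail with a message that names every location tried. The sketch below is only an illustration under that assumption; resolve_data_file and the idea of a search list are hypothetical, not how HydRA actually locates protVec_100d_3grams.csv.

```python
from pathlib import Path


def resolve_data_file(filename, search_dirs):
    """Return the first existing path to `filename` among `search_dirs`."""
    tried = []
    for directory in search_dirs:
        candidate = Path(directory) / filename
        tried.append(str(candidate))
        if candidate.is_file():
            return candidate
    # Listing every attempted location makes a stray absolute path easy to spot.
    raise FileNotFoundError(f"{filename} not found; tried: {', '.join(tried)}")
```

For example, a package could pass the directory containing its own modules (Path(__file__).parent) as one of the search directories, so the lookup does not depend on any user's home directory.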