Closed. DaHaiHuha closed this issue 3 years ago.
Hi! Thanks for sharing the code, datasets and models!
Any idea what might be happening here? I am facing the same problem when trying to reproduce the similarity results reported in the paper. I ran "eval_similarity.py" with the provided SSA models (full and without contact prediction) and test datasets (2.06-test and 2.07-new). The results show exactly the same performance drop as those reported by @DaHaiHuha (using PyTorch 1.2.0).
I would appreciate any kind of help! Thank you so much!
Strange. Is this still the case if you use pytorch 0.4.0?
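(To rule out an environment mismatch, something like the snippet below can dump the versions that are likely relevant here; the package list is my guess, not anything prescribed by the repo.)

```python
import importlib
import platform

def report_versions(names=("torch", "numpy", "scipy", "sklearn")):
    """Return {package: version, or 'not installed'} for the given packages."""
    versions = {"python": platform.python_version()}
    for name in names:
        try:
            mod = importlib.import_module(name)
            versions[name] = getattr(mod, "__version__", "unknown")
        except ImportError:
            versions[name] = "not installed"
    return versions

for pkg, ver in report_versions().items():
    print(pkg, ver)
```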
Thanks for your response! Yes, I get the same results (performance drop) when running it in a conda environment with PyTorch 0.4.0. These are the rest of the libraries:
I get the expected performance metrics with the following conda environment:
# Name Version Build Channel
attrs 17.3.0 pypi_0 pypi
biopython 1.69 np113py36_0
blas 1.0 mkl
bleach 1.5.0 py36_0
certifi 2016.2.28 py36_0
cffi 1.10.0 py36_0
cuda90 1.0 h6433d27_0 pytorch
cuda91 1.0 h4c16780_0 pytorch
cudatoolkit 9.2 0
cycler 0.10.0 py36_0
cython 0.26 py36_0
dbus 1.10.22 h3b5a359_0
decorator 4.1.2 py36_0
entrypoints 0.2.3 py36_0
expat 2.1.0 0
fontconfig 2.12.4 h88586e7_1
freetype 2.8 hab7d2ae_1
glib 2.53.6 h5d9569c_2
goatools 0.7.11 pypi_0 pypi
gst-plugins-base 1.12.4 h33fb286_0
gstreamer 1.12.4 hb53b477_0
h5py 2.7.1 py36h3585f63_0
hdf5 1.10.1 h9caa474_1
html5lib 0.9999999 py36_0
icu 58.2 h9c2bf20_1
intel-openmp 2018.0.0 hc7b2577_8
ipykernel 4.6.1 py36_0
ipython 6.1.0 py36_0
ipython_genutils 0.2.0 py36_0
ipywidgets 6.0.0 py36_0
jedi 0.10.2 py36_2
jinja2 2.9.6 py36_0
jpeg 9b 0
jsonschema 2.6.0 py36_0
jupyter 1.0.0 py36_4
jupyter_client 5.1.0 py36_0
jupyter_console 5.2.0 py36_0
jupyter_core 4.3.0 py36_0
kiwisolver 1.0.1 py36h764f252_0
libedit 3.1 heed3624_0
libffi 3.2.1 1
libgcc 5.2.0 0
libgcc-ng 7.2.0 h7cc24e2_2
libgfortran 3.0.0 1
libgfortran-ng 7.2.0 h9f7466a_2
libiconv 1.14 0
libpng 1.6.34 hb9fc6fc_0
libsodium 1.0.10 0
libstdcxx-ng 7.2.0 h7a57d05_2
libtiff 4.0.9 he85c1e1_1
libxcb 1.12 1
libxml2 2.9.4 0
markupsafe 1.0 py36_0
matplotlib 2.2.2 py36h0e671d2_0
mistune 0.7.4 py36_0
mkl 2018.0.1 h19d6760_4
nbconvert 5.2.1 py36_0
nbformat 4.4.0 py36_0
ncurses 6.0 h9df7e31_2
ninja 1.8.2 h6bb024c_1
nltk 3.2.4 py36_0
nose 1.3.7 pypi_0 pypi
notebook 5.0.0 py36_0
numpy 1.13.3 py36h3dfced4_2
olefile 0.45.1 py36_0
openssl 1.0.2l 0
pandas 0.20.3 py36_0
pandocfilters 1.4.2 py36_0
path.py 10.3.1 py36_0
patsy 0.4.1 py36_0
pcre 8.39 1
pexpect 4.2.1 py36_0
pickleshare 0.7.4 py36_0
pillow 5.1.0 py36h3deb7b8_0
pip 9.0.1 py36_1
pluggy 0.6.0 pypi_0 pypi
prompt_toolkit 1.0.15 py36_0
ptyprocess 0.5.2 py36_0
py 1.5.2 pypi_0 pypi
pycparser 2.18 py36_0
pygments 2.2.0 py36_0
pyparsing 2.2.0 py36_0
pyqt 5.6.0 py36_2
pytest 3.3.1 pypi_0 pypi
python 3.6.3 h0ef2715_3
python-crfsuite 0.9.5 pypi_0 pypi
python-dateutil 2.6.1 py36_0
pytorch 0.4.0 py36_cuda9.1.85_cudnn7.1.2_1 [cuda91] pytorch
pytz 2017.2 py36_0
pyzmq 16.0.2 py36_0
qt 5.6.2 h974d657_12
qtconsole 4.3.1 py36_0
readline 7.0 ha6073c6_4
requests 2.14.2 py36_0
scikit-learn 0.19.1 py36h7aa7ec6_0
scipy 1.0.0 py36hbf646e7_0
seaborn 0.8 py36_0
setuptools 36.4.0 py36_1
simplegeneric 0.8.1 py36_1
sip 4.18 py36_0
six 1.10.0 py36_0
sqlite 3.23.1 he433501_0
statsmodels 0.8.0 np113py36_0
terminado 0.6 py36_0
testpath 0.3.1 py36_0
tk 8.6.7 hc745277_3
tornado 4.5.2 py36_0
traitlets 4.3.2 py36_0
wcwidth 0.1.7 py36_0
wget 3.2 pypi_0 pypi
wheel 0.29.0 py36_0
widgetsnbextension 3.0.2 py36_0
xlrd 1.1.0 pypi_0 pypi
xlsxwriter 1.0.2 pypi_0 pypi
xz 5.2.3 0
zeromq 4.1.5 0
zlib 1.2.11 0
and command:
python eval_similarity.py pretrained_models/ssa_L1_100d_lstm3x512_lm_i512_mb64_tau0.5_lambda0.1_p0.05_epoch100.sav
If you roll back all of your packages to the versions above, do you get the expected results?
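(As a side note, when comparing a long pinned environment like the one above against another machine's, it can help to diff two `conda list` dumps programmatically. A minimal sketch; the helper names are mine, not from the repo.)

```python
def parse_conda_list(text):
    """Parse `conda list` output into {name: version}, skipping comment lines."""
    pkgs = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) >= 2:
            pkgs[parts[0]] = parts[1]
    return pkgs

def diff_envs(a, b):
    """Return {name: (version_in_a, version_in_b)} for mismatched packages."""
    return {name: (a.get(name), b.get(name))
            for name in sorted(set(a) | set(b))
            if a.get(name) != b.get(name)}

old = parse_conda_list("pytorch 0.4.0 py36\nnumpy 1.13.3 py36")
new = parse_conda_list("pytorch 1.2.0 py37\nnumpy 1.19.2 py37")
print(diff_envs(old, new))
```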
I can't reproduce this error with newer packages either. The following conda environment gives the expected output for me.
# Name Version Build Channel
_libgcc_mutex 0.1 main
_pytorch_select 0.2 gpu_0
blas 1.0 mkl
ca-certificates 2020.10.14 0
certifi 2020.6.20 pyhd3eb1b0_3
cffi 1.14.3 py37he30daa8_0
cudatoolkit 9.2 0
cython 0.29.21 py37h2531618_0
intel-openmp 2020.2 254
joblib 0.17.0 py_0
ld_impl_linux-64 2.33.1 h53a641e_7
libedit 3.1.20191231 h14c3975_1
libffi 3.3 he6710b0_2
libgcc-ng 9.1.0 hdf63c60_0
libgfortran-ng 7.3.0 hdf63c60_0
libstdcxx-ng 9.1.0 hdf63c60_0
mkl 2020.2 256
mkl-service 2.3.0 py37he904b0f_0
mkl_fft 1.2.0 py37h23d657b_0
mkl_random 1.1.1 py37h0573a6f_0
ncurses 6.2 he6710b0_1
ninja 1.10.1 py37hfd86e86_0
numpy 1.19.2 py37h54aff64_0
numpy-base 1.19.2 py37hfa32c7d_0
openssl 1.1.1h h7b6447c_0
pandas 1.1.3 py37he6710b0_0
pip 20.2.4 py37h06a4308_0
pycparser 2.20 py_2
python 3.7.9 h7579374_0
python-dateutil 2.8.1 py_0
pytorch 1.2.0 py3.7_cuda9.2.148_cudnn7.6.2_0 pytorch
pytz 2020.1 py_0
readline 8.0 h7b6447c_0
scikit-learn 0.23.2 py37h0573a6f_0
scipy 1.5.2 py37h0b6359f_0
setuptools 50.3.1 py37h06a4308_1
six 1.15.0 py37h06a4308_0
sqlite 3.33.0 h62c20be_0
threadpoolctl 2.1.0 pyh5ca1d4c_0
tk 8.6.10 hbc83047_0
tzdata 2020d h14c3975_0
wheel 0.35.1 pyhd3eb1b0_0
xz 5.2.5 h7b6447c_0
zlib 1.2.11 h7b6447c_3
Sorry, I still cannot reproduce the results. Is there any chance we are using different training or testing pairs? I mean the ones downloaded from this GitHub repository vs. the ones you hold locally.
These are my train/test results using eval_similarity.py:
Model Dataset Acc Pearson Spearm Class Fold Supfam Family
----------------------------------------------------------------------------------------
SSA-similarity 2.06-train 0.97809 0.94249 0.70616 0.97972 0.88670 0.91647 0.72263
2.06-test 0.91549 0.82025 0.65051 0.83429 0.77623 0.84859 0.52733
----------------------------------------------------------------------------------------
SSA-lambda0.1 2.06-train 0.99816 0.99562 0.71213 1.00000 0.99296 0.99575 0.92497
2.06-test 0.94926 0.89735 0.68512 0.89839 0.88330 0.94330 0.65151
Pre-trained models:
SSA-similarity: ssa_L1_100d_lstm3x512_lm_i512_mb64_tau0.5_p0.05_epoch100.sav
SSA-lambda0.1: ssa_L1_100d_lstm3x512_lm_i512_mb64_tau0.5_lambda0.1_p0.05_epoch100.sav
Datasets:
2.06-train: astral-scopedom-seqres-gd-sel-gs-bib-95-2.06.train.sampledpairs.txt
2.06-test: astral-scopedom-seqres-gd-sel-gs-bib-95-2.06.test.sampledpairs.txt
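(For reference on the metric columns above, the Pearson column is the standard Pearson correlation between predicted and true similarity levels. A pure-Python sketch of that computation, illustrative only and not the repo's implementation:)

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Strongly but not perfectly correlated toy data
print(round(pearson([1, 2, 3, 4], [1.1, 1.9, 3.2, 3.8]), 3))
```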
Thanks for your kind help!
OK, I managed to reproduce this error and figured out the issue. I had tweaked the BiLM model code when I released it, and that change is causing the performance drop. I'll revert the change, which should solve the problem.
This should now be fixed with commit 89a0ac2f92fea164c9d39eed167348639f3c82a7.
Great work, and thanks for releasing the code, datasets, and pre-trained models.
But I still have some questions about the training procedure; could you kindly spare some time to review the process? I have used the code in the GitHub repo to train the model several times but failed to reproduce the results (the gap is about 5%). I suspect there are differences between how your released model was trained and how I trained mine.
The training details are as follows:
Here are the questions I would like answered:
1. Is the released LM the same as the one you used for training? Should I modify the code to reproduce the results?
2. I noticed that when loading the samples for the SCOP task, the count is 22408, but after resampling to match the CMAP task, only about 10% remain (2241). Does the resampled dataset matter?
3. Was the released model obtained by searching over hyperparameters? If so, how was the search done?
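(For context on the resampling question: reducing 22408 SCOP pairs to 2241 to match the CMAP task amounts to subsampling one task's examples down to another task's size. A hypothetical sketch of that operation; the function name and seed are illustrative, not from the repo.)

```python
import random

def resample_to_match(pairs, target_size, seed=0):
    """Subsample `pairs` down to `target_size` examples (illustrative only)."""
    rng = random.Random(seed)
    if target_size >= len(pairs):
        return list(pairs)
    return rng.sample(pairs, target_size)

scop_pairs = list(range(22408))                 # stand-in for the 22408 SCOP pairs
matched = resample_to_match(scop_pairs, 2241)   # ~10% kept, matching CMAP size
print(len(matched))
```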
Besides, I revised the source code a little and submitted a PR on GitHub: https://github.com/tbepler/protein-sequence-embedding-iclr2019/pull/18/commits/dc75f65c1734e7b696825fdefbb4bdc64385d6ae Will this lead to a performance drop?
The evaluation of the models is as follows:
Results from eval_similarity.py
Results from eval_similarity.py & eval_secstr.py
Results from eval_contact_scop.py
Results from eval_transmembrane.py
Any suggestions would be appreciated!