Closed (consujdcgcg closed 2 years ago)
@consujdcgcg Hi, could you please provide more details, including code? Note that determinism comes from the SVD computation (implemented by scipy), not from the polara framework. If you observe problems, most likely something is wrong in your data setup. In any case, without looking at the code it's hard to tell what the source of your problem is.
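To illustrate the point about scipy, here is a minimal sketch (not polara's own code) of checking run-to-run reproducibility of a truncated SVD. It assumes `scipy.sparse.linalg.svds` with its default ARPACK solver. The toy matrix and the helper name `truncated_svd` are hypothetical. Passing an explicit starting vector `v0` removes any dependence on how the solver would otherwise pick one.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import svds

# Toy sparse matrix standing in for a user-item matrix (hypothetical data).
rng = np.random.default_rng(0)
dense = rng.random((200, 100))
dense[dense < 0.95] = 0.0  # keep roughly 5% of the entries
matrix = csr_matrix(dense)

def truncated_svd(mat, rank, seed=0):
    """Run svds with an explicitly seeded starting vector (illustrative helper)."""
    v0 = np.random.default_rng(seed).standard_normal(min(mat.shape))
    return svds(mat, k=rank, v0=v0)

u1, s1, vt1 = truncated_svd(matrix, rank=10)
u2, s2, vt2 = truncated_svd(matrix, rank=10)

# Compare the two runs; singular values are the most robust thing to check.
same_values = np.allclose(s1, s2)
same_factors = np.allclose(u1, u2) and np.allclose(vt1, vt2)
print(f"singular values match: {same_values}, factors match: {same_factors}")
```

If a check like this passes in your environment but the full pipeline still differs between runs, the variation is more likely introduced before the SVD step, for example in data splitting or index ordering.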
I've had a similar problem. For example, the `Comparing LightFM with HybridSVD.ipynb` notebook is returning NaNs and nonsensical precision values from tuning. The `conda list` output is below. I've also tried a Python 3.8 environment with similar results.
The OS is an Ubuntu 20.04 Docker image from Jupyter stacks running on a macOS host.
This:
print(f'The best value of {target_metric}={svd_scores.max():.4f} was achieved with '
f'rank={svd_best_config["rank"]} and scaling parameter={svd_best_config["col_scaling"]}.')
Returns:
The best value of precision=6221039324650766998772673322609708242190208693316706175944965817336080261590842579394450104503361595226760882161744806515738280386485306496943103715345795769348743453372078174231822260937898055726946448446613113937469607106741712821712360503221412380827016935069738220772234570650310161023058314860167168.0000 was achieved with rank=200 and scaling parameter=0.2.
# Name Version Build Channel
_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 1_gnu conda-forge
alembic 1.5.2 pyhd8ed1ab_0 conda-forge
anyio 2.0.2 py36h5fab9bb_4 conda-forge
argon2-cffi 20.1.0 py36h8f6f2f9_2 conda-forge
async_generator 1.10 py_0 conda-forge
attrs 20.3.0 pyhd3deb0d_0 conda-forge
babel 2.9.0 pyhd3deb0d_0 conda-forge
backcall 0.2.0 pyh9f0ad1d_0 conda-forge
backports 1.0 py_2 conda-forge
backports.functools_lru_cache 1.6.1 py_0 conda-forge
bleach 3.2.2 pyh44b312d_0 conda-forge
brotlipy 0.7.0 py36h8f6f2f9_1001 conda-forge
ca-certificates 2020.12.5 ha878542_0 conda-forge
certifi 2020.12.5 py36h5fab9bb_1 conda-forge
cffi 1.14.4 py36hc120d54_1 conda-forge
chardet 4.0.0 py36h5fab9bb_1 conda-forge
cliff 3.6.0 pyhd8ed1ab_0 conda-forge
cmaes 0.7.0 pyhac0dd68_0 conda-forge
cmd2 0.9.22 py36h9f0ad1d_1 conda-forge
colorama 0.4.4 pyh9f0ad1d_0 conda-forge
colorlog 4.7.2 py36h5fab9bb_0 conda-forge
contextvars 2.4 py_0 conda-forge
cryptography 3.3.1 py36h0a59100_1 conda-forge
cycler 0.10.0 py_2 conda-forge
dataclasses 0.7 pyhe4b4509_6 conda-forge
decorator 4.4.2 py_0 conda-forge
defusedxml 0.6.0 py_0 conda-forge
entrypoints 0.3 pyhd8ed1ab_1003 conda-forge
freetype 2.10.4 h0708190_1 conda-forge
icu 67.1 he1b5a44_0 conda-forge
idna 2.10 pyh9f0ad1d_0 conda-forge
immutables 0.14 py36h8f6f2f9_1 conda-forge
importlib-metadata 3.4.0 py36h5fab9bb_0 conda-forge
importlib_metadata 3.4.0 hd8ed1ab_0 conda-forge
ipykernel 5.4.3 py36he448a4c_0 conda-forge
ipython 7.12.0 py36h5ca1d4c_0 conda-forge
ipython_genutils 0.2.0 py_1 conda-forge
ipywidgets 7.6.3 pyhd3deb0d_0 conda-forge
jedi 0.18.0 py36h5fab9bb_2 conda-forge
jinja2 2.11.2 pyh9f0ad1d_0 conda-forge
joblib 1.0.0 pyhd8ed1ab_0 conda-forge
json5 0.9.5 pyh9f0ad1d_0 conda-forge
jsonschema 3.2.0 py_2 conda-forge
jupyter_client 6.1.11 pyhd8ed1ab_1 conda-forge
jupyter_core 4.7.0 py36h5fab9bb_1 conda-forge
jupyter_server 1.2.2 py36h5fab9bb_1 conda-forge
jupyterlab 3.0.5 pyhd8ed1ab_0 conda-forge
jupyterlab_pygments 0.1.2 pyh9f0ad1d_0 conda-forge
jupyterlab_server 2.1.2 pyhd8ed1ab_0 conda-forge
jupyterlab_widgets 1.0.0 pyhd8ed1ab_1 conda-forge
kiwisolver 1.3.1 py36h605e78d_1 conda-forge
ld_impl_linux-64 2.35.1 hea4e1c9_1 conda-forge
libblas 3.9.0 7_openblas conda-forge
libcblas 3.9.0 7_openblas conda-forge
libffi 3.3 h58526e2_2 conda-forge
libgcc-ng 9.3.0 h2828fa1_18 conda-forge
libgfortran-ng 9.3.0 hff62375_18 conda-forge
libgfortran5 9.3.0 hff62375_18 conda-forge
libgomp 9.3.0 h2828fa1_18 conda-forge
liblapack 3.9.0 7_openblas conda-forge
libllvm10 10.0.1 he513fc3_3 conda-forge
libopenblas 0.3.12 pthreads_h4812303_1 conda-forge
libpng 1.6.37 h21135ba_2 conda-forge
libsodium 1.0.18 h36c2ea0_1 conda-forge
libstdcxx-ng 9.3.0 h6de172a_18 conda-forge
lightfm 1.16 pypi_0 pypi
llvmlite 0.35.0 py36h05121d2_1 conda-forge
mako 1.1.4 pyh44b312d_0 conda-forge
markupsafe 1.1.1 py36h8f6f2f9_3 conda-forge
matplotlib 3.2.2 1 conda-forge
matplotlib-base 3.2.2 py36h5fdd944_1 conda-forge
metis 5.1.0 h58526e2_1006 conda-forge
mistune 0.8.4 py36h8f6f2f9_1003 conda-forge
nbclassic 0.2.6 pyhd8ed1ab_0 conda-forge
nbclient 0.5.1 py_0 conda-forge
nbconvert 6.0.7 py36h5fab9bb_3 conda-forge
nbformat 5.1.2 pyhd8ed1ab_1 conda-forge
ncurses 6.2 h58526e2_4 conda-forge
nest-asyncio 1.4.3 pyhd8ed1ab_0 conda-forge
notebook 6.2.0 py36h5fab9bb_0 conda-forge
numba 0.52.0 py36h284efc9_0 conda-forge
numpy 1.19.5 py36h2aa4a07_1 conda-forge
openssl 1.1.1i h7f98852_0 conda-forge
optuna 2.4.0 pyhd8ed1ab_0 conda-forge
packaging 20.8 pyhd3deb0d_0 conda-forge
pandas 1.1.5 py36h284efc9_0 conda-forge
pandoc 2.11.3.2 h7f98852_0 conda-forge
pandocfilters 1.4.2 py_1 conda-forge
parso 0.8.1 pyhd8ed1ab_0 conda-forge
pbr 5.5.1 pyh9f0ad1d_0 conda-forge
pexpect 4.8.0 pyh9f0ad1d_2 conda-forge
pickleshare 0.7.5 py_1003 conda-forge
pip 21.0 pyhd8ed1ab_0 conda-forge
polara 0.7.2 pypi_0 pypi
prettytable 2.0.0 pyhd8ed1ab_0 conda-forge
prometheus_client 0.9.0 pyhd3deb0d_0 conda-forge
prompt-toolkit 3.0.13 pyha770c72_0 conda-forge
prompt_toolkit 3.0.13 hd8ed1ab_0 conda-forge
ptyprocess 0.7.0 pyhd3deb0d_0 conda-forge
pycparser 2.20 pyh9f0ad1d_2 conda-forge
pygments 2.7.4 pyhd8ed1ab_0 conda-forge
pyopenssl 20.0.1 pyhd8ed1ab_0 conda-forge
pyparsing 2.4.7 pyh9f0ad1d_0 conda-forge
pyperclip 1.8.1 pyhd3deb0d_0 conda-forge
pyrsistent 0.17.3 py36h8f6f2f9_2 conda-forge
pysocks 1.7.1 py36h5fab9bb_3 conda-forge
python 3.6.12 hffdb5ce_0_cpython conda-forge
python-dateutil 2.8.1 py_0 conda-forge
python-editor 1.0.4 py_0 conda-forge
python_abi 3.6 1_cp36m conda-forge
pytz 2020.5 pyhd8ed1ab_0 conda-forge
pyyaml 5.4.1 py36h8f6f2f9_0 conda-forge
pyzmq 20.0.0 py36h81c33ee_1 conda-forge
readline 8.0 he28a2e2_2 conda-forge
requests 2.25.1 pyhd3deb0d_0 conda-forge
scikit-learn 0.24.1 pypi_0 pypi
scikit-sparse 0.4.4 py36hd282510_1004 conda-forge
scipy 1.5.3 py36h9e8f40b_0 conda-forge
send2trash 1.5.0 py_0 conda-forge
setuptools 49.6.0 py36h5fab9bb_3 conda-forge
six 1.15.0 pyh9f0ad1d_0 conda-forge
sniffio 1.2.0 py36h5fab9bb_1 conda-forge
sqlalchemy 1.3.22 py36h8f6f2f9_1 conda-forge
sqlite 3.34.0 h74cdb3f_0 conda-forge
stevedore 3.3.0 py36h5fab9bb_1 conda-forge
suitesparse 5.7.2 h717dc36_0 conda-forge
tbb 2020.2 h4bd325d_3 conda-forge
terminado 0.9.2 py36h5fab9bb_0 conda-forge
testpath 0.4.4 py_0 conda-forge
threadpoolctl 2.1.0 pypi_0 pypi
tk 8.6.10 h21135ba_1 conda-forge
tornado 6.1 py36h8f6f2f9_1 conda-forge
tqdm 4.56.0 pyhd8ed1ab_0 conda-forge
traitlets 4.3.3 py36h9f0ad1d_1 conda-forge
typing_extensions 3.7.4.3 py_0 conda-forge
urllib3 1.26.2 pyhd8ed1ab_0 conda-forge
wcwidth 0.2.5 pyh9f0ad1d_2 conda-forge
webencodings 0.5.1 py_1 conda-forge
wheel 0.36.2 pyhd3deb0d_0 conda-forge
widgetsnbextension 3.5.1 py36h5fab9bb_4 conda-forge
xz 5.2.5 h516909a_1 conda-forge
yaml 0.2.5 h516909a_0 conda-forge
zeromq 4.3.3 h58526e2_3 conda-forge
zipp 3.4.0 py_0 conda-forge
zlib 1.2.11 h516909a_1010 conda-forge
@rgrosskopf thanks for the detailed report, and sorry for the long wait. I believe I tracked the problem down and fixed it. More specifically, the problem was an uninitialized numpy array in the calculation of evaluation metrics. The bug was introduced in https://github.com/evfro/polara/commit/22747227954f7dd75713875a6f6b10c703c32c60 and fixed by https://github.com/evfro/polara/commit/dc6cf9e9a9f551e46b34418d24d8772b9561ce4a.
Could you please install the latest `develop` version and check that you no longer experience the issue? You can simply upgrade polara by running:
pip install --no-cache-dir --upgrade git+https://github.com/Evfro/polara.git@develop#egg=polara
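For readers unfamiliar with this class of bug, the sketch below shows how an uninitialized numpy buffer can lead to NaNs or absurdly large metric values like the one reported above. It is an illustration only, not the code from the linked commits. The function name `mean_precision_at_k` and its inputs are hypothetical.

```python
import numpy as np

def mean_precision_at_k(hits_per_user, topk):
    """Average precision@k over users, given the number of hits per user."""
    hits = np.asarray(hits_per_user, dtype=np.float64)
    # np.zeros guarantees a defined starting value. np.empty would leave
    # whatever bytes happen to be in memory, so accumulating into it can
    # yield NaNs or arbitrarily large numbers.
    scores = np.zeros(len(hits), dtype=np.float64)
    scores += hits / topk
    return scores.mean()

precision = mean_precision_at_k([1, 0, 2, 3], topk=10)
print(f"precision={precision:.4f}")
```

Because `np.empty` often happens to return zeroed memory, a bug of this kind can stay hidden on one machine and surface on another.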
Works for me! (or at least I'm getting plausible results) Thanks for getting the fix in.
I'm still getting an error running the optuna tuning for LightFM (v1.16) in the `Comparing LightFM with HybridSVD.ipynb` demo, but my main goal was to get a working starting point to compare against HybridSVD, and that I have.
I'm closing the issue. Feel free to open a new one if something is still not working on the polara side.
Hi @evfro,
This blog post, https://www.eigentheories.com/blog/lightfm-vs-hybridsvd/, mentions that SVD from polara has deterministic output, but each run of my pipeline gives me different outputs. I am using HybridSVD and I am careful with every seed and random_state instantiation, but the issue persists. How can I get the same output across runs?
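When comparing factor matrices from two runs, keep in mind that singular vectors are only defined up to sign, so two correct decompositions can differ by a factor of -1 in some columns. The sketch below uses plain numpy, not polara's API, and the helper name `factors_match_up_to_sign` is hypothetical. It checks whether two factor matrices agree once per-column sign flips are ignored. It may not be the cause of the differences described here, but it is a cheap thing to rule out.

```python
import numpy as np

def factors_match_up_to_sign(a, b, atol=1e-8):
    """Check whether the columns of a and b agree up to a per-column sign flip."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    if a.shape != b.shape:
        return False
    # Sign of the inner product between corresponding columns (+1 or -1).
    signs = np.sign(np.sum(a * b, axis=0))
    signs[signs == 0] = 1.0
    return bool(np.allclose(a, b * signs, atol=atol))

# Toy example: an orthonormal basis and a copy with two columns negated.
rng = np.random.default_rng(42)
basis, _ = np.linalg.qr(rng.standard_normal((50, 5)))
flipped = basis * np.array([1.0, -1.0, 1.0, -1.0, 1.0])

match_up_to_sign = factors_match_up_to_sign(basis, flipped)
match_exactly = bool(np.allclose(basis, flipped))
print(match_up_to_sign, match_exactly)
```

If the factors match only up to sign, downstream scores computed from products of the factors should be unaffected. Differences that remain after this check point to something else in the pipeline.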