RLPR / LabelReviews

Reproducible Label Reviews
https://rlpr.github.io

S18 : Reproducing the sparse Huffman Address Map compression for deep neural networks #18

Open dariomalchiodi opened 3 years ago

dariomalchiodi commented 3 years ago

General info

Reviewer feedback


akrah commented 3 years ago

Hi @dariomalchiodi, I have started the review. When I run runner.sh, I get many error messages like this one:

ImportError: No module named keras
Traceback (most recent call last):
  File "weightsharing.py", line 1, in <module>
    import keras

Could you please explain in your README.md how to install all the dependencies?

dariomalchiodi commented 3 years ago

Thanks for the rapid feedback! Actually, runner.sh should create a virtual environment and automatically install all libraries (including keras). We tested it on different machines and didn't experience any problems, so I have to figure out what is going on. My first hypothesis is that "python" might be bound to a 2.X version on the machine you are using. We have added an alias at the beginning of runner.sh, so that "python" is explicitly bound to Python 3. Could you please re-clone the repo (or pull it and delete the "venv_compr" directory created when building the virtual environment) and re-run runner.sh?

Thanks in advance, Dario Malchiodi
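The hypothesis above (a "python" command bound to a 2.X interpreter) can be checked before the virtual environment is built. The following is a minimal sketch, not the actual content of runner.sh: the function name find_python3 and the variable PYTHON_BIN are hypothetical, and the commented venv line only illustrates one possible way of creating the "venv_compr" directory.

```shell
#!/usr/bin/env bash
# Print the first candidate command that runs a Python 3 interpreter.
# Returns 1 when none of the candidates qualifies.
# (find_python3 is a hypothetical helper, not part of runner.sh.)
find_python3() {
    local candidate
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1 &&
            "$candidate" -c 'import sys; sys.exit(0 if sys.version_info[0] == 3 else 1)' \
                >/dev/null 2>&1; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

if PYTHON_BIN=$(find_python3 python3 python); then
    echo "Using interpreter: $PYTHON_BIN"
    # Assumption: the environment could then be created with
    #   "$PYTHON_BIN" -m venv venv_compr
else
    echo "No Python 3 interpreter found on PATH" >&2
fi
```

Calling the interpreter through a variable avoids depending on what "python" happens to mean on a given machine.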

akrah commented 3 years ago

I pulled the commits, but defining an alias in a shell script doesn't work: you need to add shopt -s expand_aliases for the alias to be expanded. Anyway, you were right, that was the problem, and I fixed it by defining the alias in the shell before running the script.
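The behaviour described here can be reproduced in isolation: non-interactive bash ignores aliases unless expand_aliases is enabled. The sketch below uses a hypothetical alias name (demo_greet) rather than "python", so that it does not depend on which interpreters are installed. It also shows a shell function as an alternative that needs no extra option.

```shell
#!/usr/bin/env bash
# Non-interactive bash does not expand aliases unless expand_aliases is set.
shopt -s expand_aliases

# Hypothetical alias name, chosen so that it cannot clash with a real command.
alias demo_greet='echo hello'

# The alias is expanded here because it was defined on an earlier line
# and expand_aliases is enabled.
alias_output=$(demo_greet)
echo "alias says: $alias_output"

# A shell function needs no special option and is often the simpler choice.
demo_greet_fn() {
    echo hello
}
fn_output=$(demo_greet_fn)
echo "function says: $fn_output"
```

Without the shopt line, the script would fail at demo_greet with a "command not found" error, which matches the symptom of an alias that is silently ignored inside a script.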

I now get the following error from the file https://github.com/giosumarin/ICPR2020_sHAM/blob/master/nets/DeepDTA/KIBA/weightsharing.py:

Using TensorFlow backend.
Traceback (most recent call last):
  File "weightsharing.py", line 143, in <module>
    TRAIN_RES = ([pre_pruning_train] + [post_pruning_train] + ws_model.acc_train)
NameError: name 'pre_pruning_train' is not defined

and this one about CUDA:

W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
E tensorflow/stream_executor/cuda/cuda_driver.cc:313] failed call to cuInit: UNKNOWN ERROR (303)

Regarding the CUDA error, perhaps you need to state that a package or driver has to be installed?

Despite these errors, the execution continues.

For now, I already recommend adding the following information for users to the README.md file:

dariomalchiodi commented 3 years ago

Thanks for the advice. We now have a more detailed README that includes the information you suggested and a link for CUDA driver installation. We have also provided two separate scripts, for GPU- and CPU-based systems respectively (on the latter, only a warning about the missing CUDA driver should now be issued). However, relying on the CPU alone is likely to take a very long time.
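Choosing between a GPU and a CPU script could be automated by looking for the library named in the log above, libcuda.so.1. The following is only a sketch under stated assumptions: the function choose_mode is hypothetical, it prints "gpu" or "cpu" instead of a script name because the names of the two scripts are not given in this thread, and it assumes a Linux system where ldconfig -p lists the shared libraries known to the dynamic linker.

```shell
#!/usr/bin/env bash
# Read a shared-library listing on stdin (for example the output of
# `ldconfig -p`) and print "gpu" if the CUDA driver library appears in it,
# "cpu" otherwise. (choose_mode is a hypothetical helper.)
choose_mode() {
    if grep 'libcuda\.so\.1' >/dev/null; then
        echo gpu
    else
        echo cpu
    fi
}

# If ldconfig is missing or not on PATH, the listing is empty and the
# result falls back to "cpu".
mode=$( { ldconfig -p 2>/dev/null || true; } | choose_mode )
echo "Detected mode: $mode"
```

Finding the library does not guarantee a working GPU setup, so a check like this can only suggest which script to try first.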