salaniz / pytorch-gve-lrcn

PyTorch implementations for "Generating Visual Explanations" (GVE) and "Long-term Recurrent Convolutional Networks" (LRCN)
MIT License

Outdated requirements.txt? #12

Closed (narendoraiswamy closed this issue 5 years ago)

narendoraiswamy commented 5 years ago

Hello,

Thank you for the PyTorch version of the code. With the provided requirements.txt file (which doesn't pin versions for the dependencies), I get the common error "Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-1_ps2cdw/matplotlib/". This usually happens when the setuptools package is out of date. Unfortunately, I have tried all the solutions I could find for this problem, and none of them helped.

Hence, is it possible for you to provide an updated requirements.txt file or the environment.yml file containing all the dependencies?

Thank you.

salaniz commented 5 years ago

Even though I cannot reproduce your problem, I can see that not having a proper requirements/environment file with version numbers can be problematic.

I will run some tests with the latest versions of PyTorch (and the other requirements) and then create a proper conda environment that should work out of the box.

narendoraiswamy commented 5 years ago

That would be very helpful. Looking forward to the updated requirements.txt/environment.yml file. If you don't mind, could you give an estimate of when it would be available?

Thank you.

salaniz commented 5 years ago

I have pushed the changes. Can you please check if creating a conda environment with the requirements from the environment.yml file resolves your issues?

narendoraiswamy commented 5 years ago

Yes, thank you. The new environment.yml file solves the problem. However, GVE model training is running on the CPU. The model probably needs to be moved to the device in the gve_trainer.py file.

salaniz commented 5 years ago

Are you sure that you have the NVIDIA drivers installed? You can check whether torch is able to use CUDA by running torch.cuda.is_available(), which should return True.

The GVETrainer class inherits from LRCNTrainer, where the model is moved to the CUDA device if one is available.
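
For reference, the check described above can be wrapped in a small diagnostic. This is only a sketch, not the code in LRCNTrainer: the function name cuda_report and the dictionary keys are made up for illustration, and the script falls back gracefully when torch is not installed.

```python
import importlib.util


def cuda_report():
    """Collect the facts that usually explain why training runs on the CPU.

    Hypothetical helper for illustration; it is not part of this repository.
    """
    # Avoid an ImportError when the environment has no torch at all.
    if importlib.util.find_spec("torch") is None:
        return {"torch_installed": False}

    import torch

    available = torch.cuda.is_available()
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # CUDA version this torch build was compiled against (None for CPU-only builds).
        "torch_built_for_cuda": torch.version.cuda,
        "cuda_available": available,
        # Device string to use when moving a model, with a CPU fallback.
        "device": "cuda" if available else "cpu",
    }


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If cuda_available is False while torch_built_for_cuda shows a version, the driver or toolkit is the likely culprit rather than the training code. With torch installed, a model could then be moved with model.to(report["device"]), where report is the returned dictionary.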

narendoraiswamy commented 5 years ago

Yes, the drivers are installed, and interestingly torch.cuda.is_available() returns False. I assume this is due to a driver version mismatch or to differences between CUDA 9 and CUDA 10: I use CUDA 9 and the environment uses CUDA 10. However, I believe there should be backward compatibility between the two.

narendoraiswamy commented 5 years ago

The issue is an incompatibility between the NVIDIA driver and the CUDA Toolkit. I am using a fairly old driver version, 384.130, and CUDA 10 needs >=410.48. Hence the problem.
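
The version comparison above can be sketched in a few lines of Python. The 410.48 minimum is the figure quoted in this thread; the 384.81 minimum for CUDA 9.0 is an assumption taken from NVIDIA's Linux release notes and should be verified there. The table and function names are hypothetical.

```python
# Minimum Linux driver per CUDA toolkit release.
# "10.0" comes from this thread; "9.0" is an assumption to verify
# against NVIDIA's CUDA Toolkit release notes.
MIN_LINUX_DRIVER = {
    "10.0": "410.48",
    "9.0": "384.81",
}


def parse_version(text):
    """Turn '384.130' into (384, 130) so versions compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_supports(driver_version, toolkit):
    """Return True if the installed driver meets the toolkit's minimum."""
    required = MIN_LINUX_DRIVER[toolkit]
    return parse_version(driver_version) >= parse_version(required)


if __name__ == "__main__":
    installed = "384.130"  # driver version reported in this thread
    for toolkit in ("10.0", "9.0"):
        print(toolkit, driver_supports(installed, toolkit))
```

Comparing tuples of integers rather than raw strings matters here: as strings, "384.130" and "384.81" would be ordered character by character, which gives the wrong answer.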

I will close the issue here. Thank you for your response :)

salaniz commented 5 years ago

Ok, have you tried changing the cudatoolkit requirement in environment.yml from cudatoolkit=10.0.130 to cudatoolkit=9.0 and then building the environment? I think that should still work.
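
The same edit can be scripted instead of done by hand. The snippet below is a minimal sketch: the function name repin_cudatoolkit and the sample text are hypothetical, and the sample is not the repository's actual environment.yml, only a stand-in containing the cudatoolkit=10.0.130 line mentioned above.

```python
import re


def repin_cudatoolkit(env_text, new_version):
    """Replace the version pinned on the cudatoolkit dependency line."""
    # Match a YAML list item such as "  - cudatoolkit=10.0.130".
    pattern = re.compile(r"^(\s*-\s*cudatoolkit)=\S+", re.MULTILINE)
    return pattern.sub(lambda m: f"{m.group(1)}={new_version}", env_text)


if __name__ == "__main__":
    # Stand-in content for illustration only.
    sample = "dependencies:\n  - pytorch\n  - cudatoolkit=10.0.130\n"
    print(repin_cudatoolkit(sample, "9.0"))
```

After writing the changed text back to environment.yml, the environment would need to be rebuilt, for example with conda env create -f environment.yml.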

narendoraiswamy commented 5 years ago

Yes, I did exactly that, and it works without breaking any other dependencies :+1: Thank you.