test can't run

JamesQFreeman commented 4 years ago

I ran python test_expVAE.py and got the following error:

cuda available
Traceback (most recent call last):
  File "test_expVAE.py", line 99, in <module>
    main()
  File "test_expVAE.py", line 77, in main
    gcam.backward(mu, logvar, mu_avg, logvar_avg)
  File "/mnt/data/home/x/expVAE/code/gradcam.py", line 60, in backward
    self.score_fc.backward(gradient=one_hot, retain_graph=True)
  File "/mnt/data/home/x/anaconda3/envs/py38pytorch151/lib/python3.8/site-packages/torch/tensor.py", line 195, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/mnt/data/home/x/anaconda3/envs/py38pytorch151/lib/python3.8/site-packages/torch/autograd/__init__.py", line 93, in backward
    grad_tensors = _make_grads(tensors, grad_tensors)
  File "/mnt/data/home/x/anaconda3/envs/py38pytorch151/lib/python3.8/site-packages/torch/autograd/__init__.py", line 25, in _make_grads
    raise RuntimeError("Mismatch in shape: grad_output["
RuntimeError: Mismatch in shape: grad_output[0] has a shape of torch.Size([128, 32]) and output[0] has a shape of torch.Size([]).

bragilee commented 4 years ago

Hi, could you please provide more information, such as the code related to this error?

I just reproduced the experiments, and the code runs without errors.

Thanks.

iremcetin commented 4 years ago

Hi, I encountered exactly the same shape mismatch error, as you can see below:

Traceback (most recent call last):
  File "", line 1, in
    runfile('/expVAE-master/code/test_expVAE.py', wdir='/expVAE-master/code')
  File "/expVAE-master/code/test_expVAE.py", line 99, in <module>
    main()
  File "/expVAE-master/code/test_expVAE.py", line 77, in main
    gcam.backward(mu, logvar, mu_avg, logvar_avg)
  File "/expVAE-master/code/gradcam.py", line 60, in backward
    self.score_fc.backward(gradient=one_hot, retain_graph=True)
  File "/python3.7/site-packages/torch/tensor.py", line 185, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/x/.local/lib/python3.7/site-packages/torch/autograd/__init__.py", line 121, in backward
    grad_tensors = _make_grads(tensors, grad_tensors)
  File "/home/x/.local/lib/python3.7/site-packages/torch/autograd/__init__.py", line 34, in _make_grads
    str(out.shape) + ".")
RuntimeError: Mismatch in shape: grad_output[0] has a shape of torch.Size([128, 32]) and output[0] has a shape of torch.Size([]).

I assume the problem is the definition of one_hot, since the error comes from line 60 in gradcam.py: self.score_fc.backward(gradient=one_hot, retain_graph=True). According to the message, one_hot has shape [128, 32] while score_fc is a scalar.
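For illustration, the mismatch can be reproduced outside the repository with a minimal sketch. The tensors below are stand-ins whose shapes are taken from the error message; they are not the actual variables or computation in gradcam.py. A scalar score is given a [128, 32] gradient argument, which the PyTorch versions reported in this thread reject.

```python
import torch

# Stand-in for the encoder output; the shape is taken from the error message.
mu = torch.randn(128, 32, requires_grad=True)

# Stand-in for score_fc: a reduction to one number has shape torch.Size([]).
score = mu.sum()

# Stand-in for one_hot: it has the shape of mu, not of the scalar score.
one_hot = torch.ones(128, 32)

error_message = None
try:
    # backward() expects `gradient` to have the same shape as the tensor
    # it is called on, which here is a scalar.
    score.backward(gradient=one_hot, retain_graph=True)
except RuntimeError as err:
    error_message = str(err)

print("score shape:  ", score.shape)
print("one_hot shape:", one_hot.shape)
print("error:        ", error_message)
```

If this sketch raises on your installation but the repository code does not, the two shapes involved are probably different from the ones assumed here.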

liuem607 commented 4 years ago

Hi,

We would like to help if you could share more information on your experiments. Did you make any modifications to the code before encountering this error? Which target layer did you choose in line 69 of test_expVAE.py, where the code calls the GradCAM function?

Thanks,

iremcetin commented 4 years ago

Hi, I did not change the code. I used line 69 as it is, which means: gcam = GradCAM(model, target_layer='encoder.2', cuda=True). I am not sure whether I need to change something here.

liuem607 commented 4 years ago

Hi, the code runs perfectly on my computer without any modification. I suggest you investigate any differences between your environment or experimental setup and the instructions on how to run the code.
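When comparing environments, it may help to post the exact versions in use. The following is a small sketch, not part of the repository; the function name environment_report is made up for this example.

```python
import sys

import torch


def environment_report():
    """Collect the version details most likely to differ between setups."""
    return {
        "python": sys.version.split()[0],
        "torch": torch.__version__,
        # CUDA version PyTorch was built against; None on CPU-only builds.
        "cuda_build": torch.version.cuda,
        # Whether a usable GPU is visible at runtime.
        "cuda_available": torch.cuda.is_available(),
    }


report = environment_report()
for key, value in report.items():
    print(f"{key}: {value}")
```

Pasting this output next to a traceback makes it easier to tell whether two reports come from the same PyTorch release.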

KJ-Waller commented 3 years ago

Hi. I'm encountering exactly the same error as described by @JamesQFreeman when running the test script. It seems to be due to a version mismatch with PyTorch, ~~although I'm not sure exactly how to fix it for PyTorch 1.7.1. However, I was able to get the code to run by creating a new Anaconda environment with PyTorch 1.0 and CUDA 9.0: conda install pytorch==1.0.0 torchvision==0.2.1 cuda90 -c pytorch~~

EDIT: I was able to get it to run with PyTorch 1.7.1 by changing line 60 in gradcam.py to self.score_fc.backward(retain_graph=True), so just removing gradient=one_hot.
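As a rough sketch of what this change does, assuming score_fc is a scalar as the error message indicates: when backward() is called on a scalar without a gradient argument, autograd uses an implicit gradient of 1, which is the same as passing a ones tensor with the scalar's own shape. The names and the computation below are hypothetical and only illustrate that equivalence.

```python
import torch

# Hypothetical stand-ins; shapes follow the error message in this thread.
mu = torch.randn(128, 32, requires_grad=True)
score = (mu * 2.0).sum()  # scalar, shape torch.Size([])

# The fix from this thread: no explicit gradient for a scalar output.
score.backward(retain_graph=True)
grad_implicit = mu.grad.clone()

# Same call with an explicit gradient that matches the scalar's shape.
mu.grad.zero_()
score.backward(gradient=torch.ones_like(score), retain_graph=True)
grad_explicit = mu.grad.clone()

same = torch.equal(grad_implicit, grad_explicit)
print("gradients identical:", same)
```

Note that with this change one_hot is no longer passed to backward() at all, so it only influences the result if it is already part of how score_fc is computed.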

liuem607 commented 3 years ago

Hi @KJ-Waller, thank you very much for this update. I followed your previous comment and tried to run with PyTorch 1.3 and 1.5, and I got this error, which I couldn't reproduce with a PyTorch version<1.0. I also went through the same link you shared. I would suggest the fix exactly as you said: just remove gradient=one_hot in line 60 of gradcam.py. With this, I got the same results as before (using PyTorch 1.0).

To me, it definitely has to do with recent PyTorch updates. Unfortunately, I'm not sure exactly how they affect automatic differentiation in our case.

liuem607 / expVAE

test can't run #1