greentfrapp / lucent

Lucid library adapted for PyTorch
Apache License 2.0
595 stars, 88 forks

Code Breaks as GPU Index > 0 #40

Open Haoxiang-Wang opened 1 year ago

Haoxiang-Wang commented 1 year ago

When using a GPU, this codebase only works with torch.device('cuda:0'); the GPU index has to be 0.

For example, if you choose torch.device('cuda:1') and run the demo code

import torch

from lucent.optvis import render
from lucent.modelzoo import inceptionv1

# Let's use cuda:1
device = torch.device("cuda:1")
model = inceptionv1(pretrained=True)
model.to(device).eval()

render.render_vis(model, "mixed4a:476")

you will see an error like

..........
File .....lucent/optvis/render.py:206, in hook_model.<locals>.hook(layer)
    204     assert layer in features, f"Invalid layer {layer}. Retrieve the list of layers with `lucent.modelzoo.util.get_model_layers(model)`."
    205     out = features[layer].features
--> 206     assert out is not None, "There are no saved feature maps. Make sure to put the model in eval mode, like so: `model.to(device).eval()`. See README for example."
    207     return out

AssertionError: There are no saved feature maps. Make sure to put the model in eval mode, like so: `model.to(device).eval()`. See README for example.

cest-andre commented 1 month ago

I've found a solution for my purposes, and hopefully it works for anyone else having this problem.

It looks like the device is hardcoded to "cuda:0" in various parts of the code, which conflicts with a model placed on a different GPU. I set device="cuda:1" in optvis.transform, optvis.param.spatial, and optvis.param.color, and everything appears to be working now.
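
For anyone unsure what this kind of conflict looks like, here is a minimal PyTorch-only sketch, not lucent's actual code. It contrasts a device fixed at "cuda:0" with a device passed explicitly to tensor creation. The helper name resolve_device is hypothetical, and it falls back to CPU when the requested GPU does not exist so the snippet runs on any machine.

```python
import torch


def resolve_device(index=0):
    """Hypothetical helper: return cuda:<index> if that GPU exists, else CPU."""
    if torch.cuda.is_available() and index < torch.cuda.device_count():
        return torch.device(f"cuda:{index}")
    return torch.device("cpu")


# Hardcoded pattern: always GPU 0 (or CPU), no matter where the model lives.
hardcoded = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Explicit pattern: pick the device once and pass it to everything that
# allocates tensors, so the inputs and the model weights end up together.
device = resolve_device(1)
model = torch.nn.Linear(4, 2).to(device)
x = torch.randn(3, 4, device=device)
out = model(x)
```

If x were created on the hardcoded device while the model sat on cuda:1, the forward pass would raise a device-mismatch RuntimeError rather than produce an output.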

greentfrapp commented 1 month ago

Thanks @Haoxiang-Wang and @cest-andre for reporting this!

I've just created a branch called refactor-device, which should help resolve this. For example, the render_vis function should automatically detect the appropriate model device.
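
As a rough illustration of what detecting the model device can look like in plain PyTorch (a sketch only, not necessarily how the refactor-device branch does it), the device can be read from the model's first parameter. The helper name model_device is hypothetical.

```python
import torch


def model_device(model):
    """Hypothetical helper: device of the first parameter, CPU if there are none."""
    try:
        return next(model.parameters()).device
    except StopIteration:
        # Modules without parameters (e.g. a bare ReLU) have no device to report.
        return torch.device("cpu")


model = torch.nn.Conv2d(3, 8, kernel_size=3).eval()
device = model_device(model)

# Allocate the input on whatever device the model already uses.
image = torch.randn(1, 3, 32, 32, device=device)
with torch.no_grad():
    features = model(image)
```

This assumes all of the model's parameters live on a single device; a model sharded across several GPUs would need a different approach.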

I will test this and merge it into dev if there are no issues.