training with stylegan2 RuntimeError: derivative for aten::grid_sampler_2d_backward is not implemented

aduchon commented 1 year ago

love the promise of this, but

In a Colab:

!python /content/vision-aided-gan/stylegan2/vision-aided-gan.py --outdir {experiment_dir} \
    --data {dataset_dir} --cfg paper256_2fmap --mirror 1 \
    --aug ada --augpipe bgc --augcv ada --batch 16 --gpus 1 \
    --kimgs-list '1000,1000,1000' --num 3

I get lots of:

/content/vision-aided-gan/stylegan2/torch_utils/ops/conv2d_gradfix.py:55: UserWarning: conv2d_gradfix not supported on PyTorch 1.13.0+cu116. Falling back to torch.nn.functional.conv2d().
  warnings.warn(f'conv2d_gradfix not supported on PyTorch {torch.__version__}. Falling back to torch.nn.functional.conv2d().')

and then:

  File "/usr/local/lib/python3.8/dist-packages/torch/autograd/__init__.py", line 197, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: derivative for aten::grid_sampler_2d_backward is not implemented

With

!python /content/vision-aided-gan/stylegan2/train.py --outdir {experiment_dir} \
    --data {dataset_dir} --cfg paper256_2fmap --mirror 1 \
    --aug ada --augpipe bgc --augcv ada --batch 16 --gpus 1

I get the same error:

RuntimeError: derivative for aten::grid_sampler_2d_backward is not implemented

With the stylegan3 script the failure is different:

!python /content/vision-aided-gan/stylegan3/train.py --outdir {experiment_dir} \
    --data {dataset_dir} --kimg 4000 --cfg stylegan3-t --gpus 1 --gamma 10 \
    --batch 16 --cv input-clip-output-conv_multi_level \
    --cv-loss multilevel_sigmoid_s --mirror 1 --aug ada --warmup 5e5

  File "/content/vision-aided-gan/stylegan3/torch_utils/ops/grid_sample_gradfix.py", line 59, in forward
    grad_input, grad_grid = op(grad_output, input, grid, 0, 0, False)
TypeError: 'tuple' object is not callable

nupurkmr9 commented 1 year ago

Hi, I think this error is because of stylegan2/3 code being incompatible with newer versions of PyTorch, example here. Can you try with a lower version of PyTorch e.g., 1.10.0? Let me know if this resolves this issue. Thanks.

RanQChi commented 1 year ago

It does not work with 1.10.0 for me; it should work with 1.9.1.

aduchon commented 1 year ago

I was able to get it to run stylegan3 (only) by installing things in this order on a Colab with a GPU:

!git clone https://github.com/nupurkmr9/vision-aided-gan.git
!pip install clip@git+https://github.com/openai/CLIP.git
!pip install wandb
!pip install clean-fid
!pip install ninja
!pip install vision_aided_loss
!pip install lpips
!pip3 install torch==1.10.2+cu113 torchvision==0.11.3+cu113 torchaudio==0.10.2+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html

But the results weren't great, just some wavy circles, so there must be some other parameters or something that I'm missing.
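Since the install order seems to matter here, a quick check that the pinned versions are the ones that actually ended up installed can save a failed run. The following is a minimal sketch, not part of the repo: the helper name `find_mismatches` and the `PINNED` table are made up for illustration, and the version strings are simply the ones from the pip3 command above.

```python
from importlib import metadata

# Versions copied from the pip3 command above; adjust if you pin differently.
PINNED = {
    "torch": "1.10.2+cu113",
    "torchvision": "0.11.3+cu113",
    "torchaudio": "0.10.2+cu113",
}


def find_mismatches(pinned, get_version=metadata.version):
    """Return {package: (wanted, found)} for every package whose installed
    version differs from the pin. `found` is None if it is not installed.

    `get_version` is injectable only so the check can be tried out without
    the real packages; by default it asks the installed metadata.
    """
    mismatches = {}
    for name, wanted in pinned.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    problems = find_mismatches(PINNED)
    for name, (wanted, found) in problems.items():
        print(f"{name}: wanted {wanted}, found {found}")
    if not problems:
        print("all pinned versions match")
```

Running this in a cell after the installs prints one line per package whose version drifted from the pin, for example if a later install pulled in a different torch.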

RanQChi commented 1 year ago

Hello~ it works for me~ Install gxx=8.5 first, then go with torch 1.13 + CUDA 11.6 + ninja. Done with the above, I changed the tensorboard init, then changed grid_sample_gradfix and conv2d_gradfix, and also changed the statement that checks the torch version in stylegan.

The reason is just that the op (grid_sampler_2d_backward) changed between PyTorch versions. Past versions simply use op = xxxx; in the current version the lookup returns a tuple, so the op has to be unpacked from it, and the call takes an additional output mask composed of ctx_input1 and ctx_input2.

https://github.com/vsemecky/stylegan3/commit/f61d77a0a5a0fbdb31bcf962ab3d5c7c7a9f0ef6
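The change described above can be sketched roughly as follows. This is only an illustration, not the code from the linked commit. The helper names `unwrap_op` and `call_grid_sampler_backward` and the `with_output_mask` flag are hypothetical, and reading "ctx_input1 and ctx_input2" as `ctx.needs_input_grad[1]` and `ctx.needs_input_grad[2]` is my assumption. The six positional arguments are the ones shown in the traceback earlier in this thread.

```python
def unwrap_op(looked_up):
    """Return the callable op.

    Older PyTorch gives back the op itself from the lookup; newer PyTorch
    gives back a tuple whose first element is the op. Calling that tuple
    directly is what raises "TypeError: 'tuple' object is not callable".
    """
    if isinstance(looked_up, tuple):
        return looked_up[0]
    return looked_up


def call_grid_sampler_backward(looked_up, ctx, grad_output, input, grid,
                               with_output_mask):
    """Call the backward op with or without the extra output mask.

    `with_output_mask` is decided by the caller based on the installed
    PyTorch version (hypothetical flag; the thread only says that newer
    versions need the mask).
    """
    op = unwrap_op(looked_up)
    args = [grad_output, input, grid, 0, 0, False]
    if with_output_mask:
        # ASSUMPTION: the mask says which of (input, grid) need gradients.
        output_mask = (ctx.needs_input_grad[1], ctx.needs_input_grad[2])
        args.append(output_mask)
    return op(*args)


# Inside the forward() of the autograd Function in grid_sample_gradfix.py,
# usage would look something like:
#   looked_up = torch._C._jit_get_operation('aten::grid_sampler_2d_backward')
#   grad_input, grad_grid = call_grid_sampler_backward(
#       looked_up, ctx, grad_output, input, grid, with_output_mask=True)
```

Keeping the tuple handling and the mask handling separate means either change can be switched on independently, which helps when it is unclear which PyTorch release introduced which change.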

ganguliSilva commented 1 year ago

What do you mean by "changed the tensorboard init", and how do I do it?

nupurkmr9 / vision-aided-gan

training with stylegan2 RuntimeError: derivative for aten::grid_sampler_2d_backward is not implemented #11