I was getting the error below during training. It seems to be the result of a backwards-incompatible change in PyTorch 1.11.0, as pointed out in PyTorch issue #75018 regarding StyleGAN3.
Traceback (most recent call last):
  File "/home/ubuntu/stylegan-xl/train.py", line 336, in <module>
    main() # pylint: disable=no-value-for-parameter
  File "/home/ubuntu/miniconda3/envs/sgxl/lib/python3.9/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/home/ubuntu/miniconda3/envs/sgxl/lib/python3.9/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/home/ubuntu/miniconda3/envs/sgxl/lib/python3.9/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/ubuntu/miniconda3/envs/sgxl/lib/python3.9/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/home/ubuntu/stylegan-xl/train.py", line 321, in main
    launch_training(c=c, desc=desc, outdir=opts.outdir, dry_run=opts.dry_run)
  File "/home/ubuntu/stylegan-xl/train.py", line 104, in launch_training
    subprocess_fn(rank=0, c=c, temp_dir=temp_dir)
  File "/home/ubuntu/stylegan-xl/train.py", line 49, in subprocess_fn
    training_loop.training_loop(rank=rank, **c)
  File "/home/ubuntu/stylegan-xl/training/training_loop.py", line 339, in training_loop
    loss.accumulate_gradients(phase=phase.name, real_img=real_img, real_c=real_c, gen_z=gen_z, gen_c=gen_c, gain=phase.interval, cur_nimg=cur_nimg)
  File "/home/ubuntu/stylegan-xl/training/loss.py", line 121, in accumulate_gradients
    loss_Gmain.backward()
  File "/home/ubuntu/miniconda3/envs/sgxl/lib/python3.9/site-packages/torch/_tensor.py", line 522, in backward
    torch.autograd.backward(
  File "/home/ubuntu/miniconda3/envs/sgxl/lib/python3.9/site-packages/torch/autograd/__init__.py", line 266, in backward
    Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
  File "/home/ubuntu/miniconda3/envs/sgxl/lib/python3.9/site-packages/torch/autograd/function.py", line 289, in apply
    return user_fn(self, *args)
  File "/home/ubuntu/stylegan-xl/torch_utils/ops/conv2d_gradfix.py", line 144, in backward
    grad_weight = Conv2dGradWeight.apply(grad_output, input)
  File "/home/ubuntu/miniconda3/envs/sgxl/lib/python3.9/site-packages/torch/autograd/function.py", line 553, in apply
    return super().apply(*args, **kwargs) # type: ignore[misc]
  File "/home/ubuntu/stylegan-xl/torch_utils/ops/conv2d_gradfix.py", line 173, in forward
    return torch._C._jit_get_operation(name)(weight_shape, grad_output, input, padding, stride, dilation, groups, *flags)
TypeError: 'tuple' object is not callable
According to their comment on that PyTorch issue, @jannehellsten pushed a change to StyleGAN3 that fixes this.
After applying these same changes locally to conv2d_gradfix.py and grid_sample_gradfix.py in stylegan-xl, I can confirm that the model is training smoothly on my custom dataset.
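For anyone wondering why the last frame fails, here is a minimal sketch of the failure mode in plain Python. It assumes, based only on the traceback, that the lookup in conv2d_gradfix.py now returns a tuple rather than the callable op itself. The functions `get_operation_old`, `get_operation_new`, and `resolve_op` are hypothetical stand-ins I made up for illustration; they are not PyTorch APIs, and this is not the change that was made in StyleGAN3.

```python
def get_operation_old(name):
    """Hypothetical stand-in: older behaviour, returns the op itself."""
    return lambda *args: (name, args)


def get_operation_new(name):
    """Hypothetical stand-in: newer behaviour, returns a tuple.

    Assumption: the callable is the first element of the tuple.
    """
    return (lambda *args: (name, args), ["hypothetical_overload"])


def resolve_op(result):
    """Return a callable whether `result` is the op or a tuple holding it."""
    if callable(result):
        return result
    if isinstance(result, tuple) and result and callable(result[0]):
        return result[0]
    raise TypeError(f"cannot find a callable in {type(result).__name__}")


# Calling the new-style result directly reproduces the error message
# seen at the bottom of the traceback.
try:
    get_operation_new("example_op")(1, 2)
except TypeError as exc:
    error_message = str(exc)

# Unwrapping first works for both return shapes.
old_result = resolve_op(get_operation_old("example_op"))(1, 2)
new_result = resolve_op(get_operation_new("example_op"))(1, 2)
```

Code that calls the lookup result directly, as line 173 of conv2d_gradfix.py does, breaks as soon as the return type changes, which is why the files needed patching rather than just a config tweak.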