NVlabs / ODISE

Official PyTorch implementation of ODISE: Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models [CVPR 2023 Highlight]
https://arxiv.org/abs/2303.04803

RuntimeError: expected scalar type Half but found Float #26

Open caijunhao opened 1 year ago

caijunhao commented 1 year ago

Thanks for your great work!

When I run tools/train_net.py on 2 V100 GPUs, I get the following error:

  File "/mnt/cap/caijh/app/src/detectron2/detectron2/engine/train_loop.py", line 155, in train
    self.run_step()
  File "/mnt/workspace/code/ODISE/odise/engine/train_loop.py", line 297, in run_step
    grad_norm = self.grad_scaler(
  File "/mnt/workspace/code/ODISE/odise/engine/train_loop.py", line 207, in __call__
    self._scaler.scale(loss).backward(create_graph=create_graph)
  File "/mnt/cap/caijh/anaconda3/envs/odise/lib/python3.9/site-packages/torch/_tensor.py", line 488, in backward
    torch.autograd.backward(
  File "/mnt/cap/caijh/anaconda3/envs/odise/lib/python3.9/site-packages/torch/autograd/__init__.py", line 197, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
  File "/mnt/cap/caijh/anaconda3/envs/odise/lib/python3.9/site-packages/torch/autograd/function.py", line 267, in apply
    return user_fn(self, *args)
  File "/mnt/workspace/code/ODISE/third_party/stable-diffusion/ldm/modules/diffusionmodules/util.py", line 138, in backward
    output_tensors = ctx.run_function(*shallow_copies)
  File "/mnt/workspace/code/ODISE/third_party/stable-diffusion/ldm/modules/attention.py", line 212, in _forward
    x = self.attn1(self.norm1(x)) + x
  File "/mnt/cap/caijh/anaconda3/envs/odise/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/cap/caijh/anaconda3/envs/odise/lib/python3.9/site-packages/torch/nn/modules/normalization.py", line 190, in forward
    return F.layer_norm(
  File "/mnt/cap/caijh/anaconda3/envs/odise/lib/python3.9/site-packages/torch/nn/functional.py", line 2515, in layer_norm
    return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: expected scalar type Half but found Float

The command and its arguments are:

./tools/train_net.py --config-file configs/Panoptic/odise_label_coco_50e.py --num-gpus 2 --amp
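
One unverified guess, based only on the traceback: the error is raised while the checkpoint `backward` in ldm/modules/diffusionmodules/util.py re-runs the forward pass, and that recomputation may happen outside the autocast context that `--amp` enables, so `layer_norm` would see half-precision activations alongside float32 weights. Below is a minimal sketch of a checkpoint function that records the autocast state in `forward` and re-enters it in `backward`. The class name `AutocastCheckpoint` is hypothetical, and this is not the ODISE or stable-diffusion code, only an illustration of the idea.

```python
import torch


class AutocastCheckpoint(torch.autograd.Function):
    """Hypothetical checkpoint function that restores autocast in backward."""

    @staticmethod
    def forward(ctx, run_function, *inputs):
        ctx.run_function = run_function
        # Remember the CUDA autocast state that was active in the forward pass.
        ctx.autocast_kwargs = {
            "enabled": torch.is_autocast_enabled(),
            "dtype": torch.get_autocast_gpu_dtype(),
            "cache_enabled": torch.is_autocast_cache_enabled(),
        }
        ctx.save_for_backward(*inputs)
        # Do not keep intermediate activations; they are recomputed in backward.
        with torch.no_grad():
            return run_function(*inputs)

    @staticmethod
    def backward(ctx, *grad_outputs):
        # Detach the saved inputs so the recomputed graph is local to this call.
        detached = [t.detach().requires_grad_(t.requires_grad) for t in ctx.saved_tensors]
        # Recompute the forward pass under the same autocast settings as before.
        with torch.enable_grad(), torch.autocast("cuda", **ctx.autocast_kwargs):
            outputs = ctx.run_function(*detached)
        # Accumulates parameter gradients and fills .grad on the detached inputs.
        torch.autograd.backward(outputs, grad_outputs)
        # First None is the gradient slot for run_function itself.
        return (None,) + tuple(t.grad for t in detached)
```

It would be called as `AutocastCheckpoint.apply(module, x)` in place of `module(x)`. As with other re-entrant checkpoints, at least one input has to require gradients, otherwise `backward` is never invoked.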

I would appreciate any ideas on how to solve this issue. Thank you.