When I ran evaluation on a single GPU, it gave me an out-of-memory CUDA error (at torch.cuda.synchronize()). So I added one more GPU in the default config, and now it gives me the issue below.
File "actionformer_release/libs/utils/train_utils.py", line 391, in valid_one_epoch
output = model(video_list)
return forward_call(*input, **kwargs)
raise RuntimeError("module must have its parameters and buffers "
RuntimeError: module must have its parameters and buffers on device cuda:3 (device_ids[0]) but found one of them on device: cpu
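The RuntimeError above is nn.DataParallel's standard device check: every parameter and buffer must already live on device_ids[0] before the forward pass. A minimal sketch of the usual fix, using a stand-in model (not the actual ActionFormer code) and hypothetical GPU ids matching the error message:

```python
import torch
import torch.nn as nn

# Stand-in for the real model built by the repo's config machinery.
model = nn.Linear(8, 2)

if torch.cuda.is_available():
    device_ids = [3, 4]  # hypothetical ids; device_ids[0] is cuda:3 as in the error
    # Move parameters/buffers to device_ids[0] BEFORE wrapping; a model left on
    # the CPU triggers "module must have its parameters and buffers on device
    # cuda:3 (device_ids[0]) but found one of them on device: cpu".
    model = model.to(f"cuda:{device_ids[0]}")
    model = nn.DataParallel(model, device_ids=device_ids)
```

On a CUDA-less machine the block leaves the model on the CPU unchanged; the point is only the ordering of `.to(...)` and the `DataParallel` wrap.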
The current evaluation batch size is one, i.e., using multiple GPUs won't reduce memory usage. Since mixed/half precision may hurt the model's performance, you can try CPU inference instead.
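The CPU-inference suggestion can be sketched as follows, with a small stand-in model (the real one comes from the repo's config loader, which is not shown here):

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the ActionFormer model.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))

# If the checkpointed model was wrapped in nn.DataParallel, unwrap it first:
# model = model.module if isinstance(model, nn.DataParallel) else model

model = model.to("cpu").eval()  # move every parameter and buffer to the CPU

with torch.no_grad():  # skip autograd bookkeeping to cut memory use
    x = torch.randn(1, 16)  # batch size 1, matching the evaluation setup
    out = model(x)
```

CPU inference is slower, but it sidesteps both the CUDA out-of-memory error and the DataParallel device mismatch, since everything stays on one device.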