Closed by nivibilla 1 year ago
Fixed by setting ddp_find_unused_parameters=False.
Is the notebook broken or is it a different setting than the standard single GPU colab?
I'm not sure. I ran it in Databricks, so it may be a Databricks issue. I think the system believes there are multiple GPUs even though there aren't, and setting that flag fixes it.
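For anyone landing here later, a minimal sketch of how the two settings discussed in this thread might be passed together. It assumes the notebook builds its trainer from Hugging Face transformers' TrainingArguments, which the thread does not confirm; the output_dir value and the build_training_args helper are hypothetical names used only for illustration.

```python
# Assumption: the notebook configures its trainer through Hugging Face
# `transformers.TrainingArguments`; the thread does not show the real config.
training_kwargs = {
    "output_dir": "outputs",              # hypothetical path, not from the thread
    "gradient_checkpointing": True,       # the setting that triggered the error
    "ddp_find_unused_parameters": False,  # the fix reported in this thread
}


def build_training_args(kwargs=training_kwargs):
    """Create TrainingArguments from the keyword arguments above."""
    # Imported lazily so the dictionary can be inspected even when
    # transformers is not installed.
    from transformers import TrainingArguments

    return TrainingArguments(**kwargs)
```

Calling build_training_args() and passing the result to the trainer as its args would apply the flag. Whether this is needed outside Databricks is unclear from the thread.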
None of the inputs have requires_grad=True. Gradients will be None
  warnings.warn("None of the inputs have requires_grad=True. Gradients will be None")
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
Enabling gradient checkpointing in the notebook trainer doesn't work.