ezelikman opened 3 years ago
In the original implementation, evaluation during finetuning is performed with gradient tracking enabled. This can be corrected by wrapping the evaluation in a `torch.no_grad()` context.
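A minimal sketch of the suggested fix, assuming a generic evaluation loop (the model and `evaluate` helper here are hypothetical stand-ins, not the repository's actual code):

```python
import torch
import torch.nn as nn

# Hypothetical tiny model standing in for the finetuned network.
model = nn.Linear(4, 2)

def evaluate(model, inputs):
    """Run a forward pass inside torch.no_grad() so no autograd
    graph is built: saves memory and prevents gradients from the
    evaluation pass leaking into subsequent finetuning updates."""
    model.eval()
    with torch.no_grad():
        outputs = model(inputs)
    model.train()
    return outputs

x = torch.randn(8, 4)
out = evaluate(model, x)
# Outputs produced under no_grad() do not track gradients.
assert not out.requires_grad
```

Without the `torch.no_grad()` wrapper, the evaluation forward pass would build a computation graph, increasing memory use and leaving stale gradients that could corrupt the next optimizer step unless explicitly zeroed.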