Closed jrodriguezpuigvert closed 1 year ago
Hi, I have an implementation question: Why do you use an nn.Identity() as the last activation at the end of the head?
When I train without it, I get some CUDA issues.