SeanNaren / warp-ctc

PyTorch bindings for warp-ctc
Apache License 2.0
756 stars · 271 forks

Use grad_output in pytorch binding #170

Closed · stanleee5 closed this 4 years ago

stanleee5 commented 4 years ago

It seems that `grad_output` in `torch.autograd.Function` carries the incoming gradient from the layer downstream of it, but the binding currently ignores it. When I scaled the CTC loss (e.g. when using AMP), I had to scale the gradient manually too.
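
To illustrate the point, here is a minimal runnable sketch. `ScaledSum` is a toy stand-in for the CTC binding, not the binding's actual code: its forward computes a summed loss and stashes a precomputed gradient, the way the warp-ctc wrapper stashes the gradient returned by the CUDA kernel, and its backward shows the proposed fix of multiplying that stashed gradient by `grad_output`:

```python
import torch
from torch.autograd import Function

class ScaledSum(Function):
    """Toy stand-in for the CTC binding (illustrative only)."""

    @staticmethod
    def forward(ctx, acts):
        # Pretend this gradient came back from the warp-ctc kernel.
        grads = torch.ones_like(acts)
        ctx.save_for_backward(grads)
        return acts.sum()

    @staticmethod
    def backward(ctx, grad_output):
        grads, = ctx.saved_tensors
        # The fix: scale the stashed gradient by grad_output so that
        # any scaling applied after the loss (AMP loss scaling, .mean(),
        # a constant factor) reaches the activations. Previously the
        # stashed gradient was returned as-is and grad_output was ignored.
        return grads * grad_output

acts = torch.randn(5, requires_grad=True)
loss = ScaledSum.apply(acts)
(loss * 128.0).backward()   # e.g. an AMP-style loss scale
print(acts.grad)            # every entry is 128, not 1
```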

SeanNaren commented 4 years ago

Yeah I think it's high time we introduce this fix, and fix things upstream when they break.

If people want the original behaviour, where gradients are not scaled, they will need to remove any averaging done on the loss, since that averaging will now automatically apply to the gradients as well!
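
Concretely, continuing the toy `ScaledSum` sketch above: dividing the loss before calling `backward()` used to be a no-op for the gradients, but with this change the division propagates, so code that relied on the old behaviour should drop the averaging:

```python
acts = torch.randn(5, requires_grad=True)
loss = ScaledSum.apply(acts) / 5   # batch-size averaging on the loss
loss.backward()
print(acts.grad)                   # 0.2 everywhere now, not 1.0
```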