I have a question about your backward CUDA code.
I could derive the first equation: // grad[i] = grad_weights[i] * T[i] - back_cum / (1 - alpha[i] + 1e-10)
but I really don't know how to get the second equation: // back_cum += grad_weights[i] * weight[i]
What does it mean? Many thanks!
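
For context, here is a minimal CPU sketch of how the two lines might fit together. It assumes the forward pass computes weight[i] = alpha[i] * T[i] with T[i] = (1 - alpha[0]) * ... * (1 - alpha[i-1]); that forward definition is an assumption, since the forward code is not quoted here. The function names (forward_weights, backward_alpha, numeric_grad) are hypothetical and not taken from the repository. Under that assumption, every later weight[j] (j > i) contains the factor (1 - alpha[i]), so d weight[j] / d alpha[i] = -weight[j] / (1 - alpha[i]), and back_cum would be the running sum of grad_weights[j] * weight[j] over j > i when the loop runs from the last sample to the first.

```python
def forward_weights(alpha):
    """Assumed forward pass: T[i] = prod_{j<i} (1 - alpha[j]), weight[i] = alpha[i] * T[i]."""
    T = 1.0
    Ts, ws = [], []
    for a in alpha:
        Ts.append(T)
        ws.append(a * T)
        T *= (1.0 - a)
    return Ts, ws


def backward_alpha(alpha, grad_weights):
    """Backward loop using the two quoted lines, iterating from the last sample to the first."""
    Ts, ws = forward_weights(alpha)
    grad = [0.0] * len(alpha)
    back_cum = 0.0  # holds sum over j > i of grad_weights[j] * weight[j]
    for i in reversed(range(len(alpha))):
        grad[i] = grad_weights[i] * Ts[i] - back_cum / (1.0 - alpha[i] + 1e-10)
        back_cum += grad_weights[i] * ws[i]
    return grad


def numeric_grad(alpha, grad_weights, eps=1e-6):
    """Central finite differences of L = sum_i grad_weights[i] * weight[i] w.r.t. alpha."""
    def loss(a):
        _, ws = forward_weights(a)
        return sum(g * w for g, w in zip(grad_weights, ws))

    out = []
    for i in range(len(alpha)):
        hi = list(alpha)
        lo = list(alpha)
        hi[i] += eps
        lo[i] -= eps
        out.append((loss(hi) - loss(lo)) / (2.0 * eps))
    return out


if __name__ == "__main__":
    alpha = [0.1, 0.4, 0.3, 0.7]        # arbitrary example values
    grad_weights = [1.0, -2.0, 0.5, 3.0]  # arbitrary upstream gradients
    print(backward_alpha(alpha, grad_weights))
    print(numeric_grad(alpha, grad_weights))
```

If the two printed lists agree, that would suggest back_cum is just the accumulated contribution of all later samples to the gradient of alpha[i], at least under the assumed forward definition.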