orobix / fwdgrad

Implementation of "Gradients without backpropagation" paper (https://arxiv.org/abs/2202.08587) using functorch
MIT License

Implementation quick question #17

Open macrocredit opened 1 year ago

macrocredit commented 1 year ago

Hi - Great work!

I have one question:

_loss, jvp = fc.jvp(f, (tuple(params),), (vparams,))

Do you know why jvp is a scalar? I would have thought that this is a matrix. Also, is there a reason why we are calling tuple(params) instead of params?
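A short worked derivation may help with the shape question. It assumes, as the name `_loss` suggests, that `f` maps the parameters to a scalar loss; the symbols \(\theta\) (the flattened parameters), \(v\) (the flattened `vparams`) and \(n\) (the parameter count) are notation introduced here, not names from the repository.

```latex
% Assumption: f : R^n -> R is the scalar loss as a function of the parameters.
\[
J_f(\theta) \in \mathbb{R}^{1 \times n}, \qquad v \in \mathbb{R}^{n}
\]
\[
\mathrm{jvp} \;=\; J_f(\theta)\, v \;=\; \nabla f(\theta) \cdot v \;=\; \sum_{i=1}^{n} \frac{\partial f}{\partial \theta_i}\, v_i \;\in\; \mathbb{R}
\]
% A JVP always has the shape of the function's output. It would be a vector
% only if f returned a vector, and it is never the full Jacobian matrix.
% The forward gradient of the paper rescales the tangent by this scalar:
\[
g(\theta) \;=\; \bigl(\nabla f(\theta) \cdot v\bigr)\, v
\]
```

On `tuple(params)`: functorch's `jvp` takes its primals and tangents as tuples, and the tangents must have the same structure as the primals. If `params` is not already a tuple (for example a dict values view, which is an assumption about the calling code), converting it makes it match `vparams`, and the outer `(tuple(params),)` marks it as the single positional argument of `f`.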

Thank you.

FITZET commented 11 months ago

I also found that the jvp is a scalar, and I'm not sure how it is calculated. I'd like a formula that shows the computation layer by layer. Do you know how to derive one?
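As a rough illustration of the layer-by-layer flow asked about above, here is a minimal pure-Python sketch of forward-mode differentiation through a tiny two-layer network. It is not the repository's code or functorch's internals: the network, the helper names (`linear_jvp`, `tanh_jvp`, `mse_jvp`, `loss_and_jvp`) and all numeric values are hypothetical. Each layer receives a value together with its tangent and returns the output value together with the output tangent; the tangent that comes out of the scalar loss is the jvp.

```python
import math


def linear_jvp(W, b, x, dW, db, dx):
    """Rule for y = W x + b:  dy = dW x + W dx + db."""
    y, dy = [], []
    for i in range(len(W)):
        cols = range(len(x))
        y.append(sum(W[i][j] * x[j] for j in cols) + b[i])
        dy.append(sum(dW[i][j] * x[j] + W[i][j] * dx[j] for j in cols) + db[i])
    return y, dy


def tanh_jvp(x, dx):
    """Rule for y = tanh(x):  dy = (1 - y^2) * dx, elementwise."""
    y = [math.tanh(a) for a in x]
    dy = [(1.0 - t * t) * d for t, d in zip(y, dx)]
    return y, dy


def mse_jvp(y, dy, target):
    """Rule for loss = mean((y - t)^2):  dloss = mean(2 (y - t) dy)."""
    n = len(y)
    loss = sum((a - t) ** 2 for a, t in zip(y, target)) / n
    dloss = sum(2.0 * (a - t) * d for a, t, d in zip(y, target, dy)) / n
    return loss, dloss


def loss_and_jvp(params, tangents, x, target):
    """Push (value, tangent) pairs through every layer in one forward pass."""
    W1, b1, W2, b2 = params
    dW1, db1, dW2, db2 = tangents
    dx = [0.0] * len(x)  # the input is held fixed, so its tangent is zero
    h, dh = linear_jvp(W1, b1, x, dW1, db1, dx)
    a, da = tanh_jvp(h, dh)
    y, dy = linear_jvp(W2, b2, a, dW2, db2, da)
    return mse_jvp(y, dy, target)


# Hypothetical parameters and a hypothetical tangent direction (the role of vparams).
params = ([[0.5, -0.3], [0.1, 0.8]], [0.0, 0.1], [[1.0, -1.0]], [0.2])
tangents = ([[0.2, -0.1], [0.4, 0.3]], [0.1, -0.2], [[-0.5, 0.6]], [0.3])
x, target = [1.0, 2.0], [0.5]

loss, jvp = loss_and_jvp(params, tangents, x, target)
print(loss, jvp)  # both are plain Python floats
```

The tangents stay vectors inside the network and collapse to a single number only at the loss, which is why the final jvp is a scalar even though every parameter has its own perturbation.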