Separate forward/backward pass

Ebanflo42 commented 5 months ago

We need to design the API carefully so that the user has access to both an executable that returns the results of the desired diff calls (for training) and an executable that returns everything except those (for testing).

This is part of a broader set of issues that will arise from the need to embed contexts in other contexts (for example, separating out the optimizer step, or designing recurrent architectures). In this case, it might make sense to let the user design a forward-pass context that takes no labels and outputs no gradients, then clone that context and recover all the desired node identifiers in order to build a second context that takes both inputs and labels and outputs predictions, loss, and gradients. The two executables can then be used separately.
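To make the clone-and-extend idea concrete, here is a rough sketch in Python using a toy graph builder. Everything in it (`Context`, `parameter`, `clone`, `diff`, `compile`, and the idea that node identifiers are plain list indices) is hypothetical and invented for illustration; it is not unda's actual API, only one way the workflow could look.

```python
import copy


class Context:
    """Toy computation graph. Hypothetical stand-in, NOT unda's real API."""

    def __init__(self):
        self.nodes = []  # each node is (op, args); its index is its identifier

    def _add(self, op, *args):
        self.nodes.append((op, args))
        return len(self.nodes) - 1

    def parameter(self, name):
        return self._add("param", name)

    def constant(self, value):
        return self._add("const", value)

    def add(self, a, b):
        return self._add("add", a, b)

    def sub(self, a, b):
        return self._add("sub", a, b)

    def mul(self, a, b):
        return self._add("mul", a, b)

    def clone(self):
        # Identifiers are list indices, so they remain valid in the clone.
        return copy.deepcopy(self)

    def diff(self, out, wrt):
        """Append reverse-mode gradient nodes; return the id of d(out)/d(wrt)."""
        grads = {out: self.constant(1.0)}
        for i in range(out, -1, -1):  # indices are already topologically sorted
            if i not in grads:
                continue
            op, args = self.nodes[i]
            g = grads[i]
            if op == "add":
                contribs = [(args[0], g), (args[1], g)]
            elif op == "sub":
                contribs = [(args[0], g),
                            (args[1], self.mul(self.constant(-1.0), g))]
            elif op == "mul":
                contribs = [(args[0], self.mul(g, args[1])),
                            (args[1], self.mul(g, args[0]))]
            else:  # parameters and constants have no inputs
                contribs = []
            for node, c in contribs:
                grads[node] = self.add(grads[node], c) if node in grads else c
        if wrt not in grads:
            return self.constant(0.0)
        return grads[wrt]

    def compile(self, outputs):
        nodes = list(self.nodes)  # snapshot so later edits don't leak in

        def run(**inputs):
            vals = []
            for op, args in nodes:
                if op == "param":
                    vals.append(inputs[args[0]])
                elif op == "const":
                    vals.append(args[0])
                elif op == "add":
                    vals.append(vals[args[0]] + vals[args[1]])
                elif op == "sub":
                    vals.append(vals[args[0]] - vals[args[1]])
                elif op == "mul":
                    vals.append(vals[args[0]] * vals[args[1]])
            return [vals[o] for o in outputs]

        return run


# Forward-pass context: inputs and parameters only, no labels, no gradients.
fwd = Context()
x = fwd.parameter("x")
w = fwd.parameter("w")
pred = fwd.mul(w, x)
predict = fwd.compile([pred])

# Training context: clone the forward pass, reuse its ids, add label/loss/grad.
train = fwd.clone()
y = train.parameter("y")
err = train.sub(pred, y)
loss = train.mul(err, err)
grad_w = train.diff(loss, w)
train_step = train.compile([pred, loss, grad_w])
```

With this shape, `predict(x=3.0, w=2.0)` needs no label, while `train_step(x=3.0, w=2.0, y=4.0)` returns the prediction, loss, and gradient together. The forward context is never mutated by the `diff` call, which is the property the proposal depends on.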

Ebanflo42 commented 5 months ago

It would also make sense to add gradient clipping as part of the solution to this issue.

Ebanflo42 commented 5 months ago

Scratch that, gradient clipping should be a separate feature in the core autodiff engine. I don't know why I thought the two were related.

unda-ml / unda

Separate forward/backward pass #72