Expose the parameter precompute_grad. Setting it to False averages the second-order derivatives over the batches, which requires less memory (no need to keep the computation graph) but is generally slower.
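As a rough illustration of what "averaging over the batches" means, here is a minimal pure-Python sketch. It is not the library's implementation: the function names hvp and batched_hvp, the toy least-squares loss, and the data are all assumptions made for this example. It only shows that, for equally sized batches, the mean of per-batch Hessian-vector products matches the product computed on the full data while handling one batch at a time.

```python
# Illustrative only: for the toy loss L(w) = mean((x.w - y)^2) / 2,
# the Hessian-vector product is H v = mean(x * (x.v)) and does not depend on y.
# hvp and batched_hvp are hypothetical helpers, not part of any library.

def hvp(rows, v):
    """Hessian-vector product of the toy loss over all of `rows` at once."""
    out = [0.0] * len(v)
    for x in rows:
        xv = sum(xi * vi for xi, vi in zip(x, v))  # scalar x.v
        for j in range(len(v)):
            out[j] += x[j] * xv
    return [o / len(rows) for o in out]


def batched_hvp(rows, v, batch_size):
    """Average the per-batch products, processing one batch at a time."""
    batches = [rows[i:i + batch_size] for i in range(0, len(rows), batch_size)]
    total = [0.0] * len(v)
    for batch in batches:
        for j, h in enumerate(hvp(batch, v)):
            total[j] += h
    # Assumes equally sized batches; otherwise a weighted mean would be needed.
    return [t / len(batches) for t in total]


data = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
v = [1.0, -1.0]

full = hvp(data, v)
averaged = batched_hvp(data, v, batch_size=2)
```

In this sketch the batched variant recomputes each batch's product on every call, which is where the extra runtime comes from, while never holding more than one batch's intermediate results.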
Checklist
[ ] Wrote Unit tests (if necessary)
[x] Updated Documentation (if necessary)
[x] Updated Changelog
[ ] If notebooks were added/changed, added boilerplate cells are tagged with "tags": ["hide"] or "tags": ["hide-input"]
Description
This PR closes #497