Adds a tutorial for BackPACK's retain_graph option. It shows how to distribute the GGN diagonal computation of an auto-encoder architecture over multiple backward passes to reduce peak memory.
This use case recently came up in a discussion with @wiseodd
on Laplace approximations for auto-encoders (or any large
output neural network with square loss).
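The trick relies on the GGN diagonal for a square loss being additive over output elements, so it can be accumulated across several backward passes instead of one memory-heavy pass. Below is a minimal plain-PyTorch sketch of that idea (not BackPACK's actual tutorial code); the toy model, shapes, and loop structure are all illustrative assumptions.

```python
import torch

torch.manual_seed(0)

# Illustrative stand-in for an auto-encoder with a "large" output
# dimension and square loss; all sizes here are made up.
model = torch.nn.Sequential(
    torch.nn.Linear(6, 3), torch.nn.Tanh(), torch.nn.Linear(3, 6)
)
x = torch.rand(4, 6)
params = list(model.parameters())

# For l(f) = sum_i (f_i - y_i)^2, the GGN diagonal is
# diag(G) = 2 * sum_i (d f_i / d theta)^2, i.e. a sum over output
# elements. This additivity is what allows splitting the computation
# over multiple backward passes to lower peak memory.
diag_ggn = [torch.zeros_like(p) for p in params]

outputs = model(x).flatten()
for idx, f_i in enumerate(outputs):
    # Keep the graph alive for every backward pass except the last,
    # analogous to BackPACK's retain_graph option.
    retain = idx < outputs.numel() - 1
    grads = torch.autograd.grad(f_i, params, retain_graph=retain)
    for acc, g in zip(diag_ggn, grads):
        acc.add_(2 * g**2)
```

Each pass only needs the gradient of a single output element, so the peak memory footprint is governed by one backward pass rather than the full Jacobian.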