Implement Gramian based aggregators as such

TorchJD / torchjd

Library for Jacobian descent with PyTorch. It enables optimization of neural networks with multiple losses (e.g. multi-task learning).

https://torchjd.org

MIT License


Implement Gramian based aggregators as such #136

Closed PierreQuinton closed 1 month ago

PierreQuinton commented 1 month ago

This concerns the following aggregators:

UPGrad
MGDA
DualProj
PCGrad
IMTL-G
CAGrad
Nash-MTL
Aligned-MTL

Implementing them on top of the Gramian is a strict improvement and a first step towards Gramian-based Jacobian descent (JD).

ValerianRey commented 1 month ago

Yes, for that we would need a third class hierarchy (the first is Aggregator, the second is _Weighting) that would be something like _GramianWeighting.

It would have the responsibility of giving weights (vector of shape [m]) from a Gramian (matrix of shape [m, m]).
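To make the idea concrete, here is a minimal sketch of what such a hierarchy could look like. It is not torchjd code: the names `GramianWeighting`, `InverseNormWeighting`, `gramian` and `aggregate` are hypothetical, the inverse-norm rule is only an illustration (it is none of the aggregators listed above), and plain Python lists stand in for tensors to keep the example dependency-free. The point it shows is that the weighting only ever sees the Gramian G = J Jᵀ of shape [m, m], never the Jacobian J of shape [m, n].

```python
from abc import ABC, abstractmethod
from math import sqrt

Matrix = list[list[float]]
Vector = list[float]


def gramian(jacobian: Matrix) -> Matrix:
    """Return G = J J^T (shape [m, m]) for a Jacobian J of shape [m, n]."""
    return [
        [sum(a * b for a, b in zip(row_i, row_j)) for row_j in jacobian]
        for row_i in jacobian
    ]


class GramianWeighting(ABC):
    """Hypothetical base class: maps a Gramian [m, m] to weights [m]."""

    @abstractmethod
    def weights(self, gramian: Matrix) -> Vector:
        ...


class InverseNormWeighting(GramianWeighting):
    """Illustrative rule only: w_i = 1 / ||g_i||, read from the diagonal of G."""

    def weights(self, gramian: Matrix) -> Vector:
        # G[i][i] is the squared norm of the i-th row of the Jacobian.
        return [
            1.0 / sqrt(gramian[i][i]) if gramian[i][i] > 0 else 0.0
            for i in range(len(gramian))
        ]


def aggregate(jacobian: Matrix, weighting: GramianWeighting) -> Vector:
    """Combine the rows of J as J^T w, with w computed from the Gramian alone."""
    w = weighting.weights(gramian(jacobian))
    n = len(jacobian[0])
    return [sum(w_i * row[k] for w_i, row in zip(w, jacobian)) for k in range(n)]
```

With this split, the only place that touches the full Jacobian is `aggregate`; anything that can be expressed as a function of G lives behind the weighting interface.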

I think I'd like to work on statefulness first, as it will also affect class hierarchies.

ValerianRey commented 1 month ago

I'm actually not sure what to do about stateful aggregators yet, so we will probably do this (the Gramian-based weighting) first.