ApolloResearch / rib

Library for methods related to the Local Interaction Basis (LIB)
MIT License

Negative variance iff edge ablations in Split LN #318

Open stefan-apollo opened 8 months ago

stefan-apollo commented 8 months ago

Because of our RIB rotations, the variance node can experience arbitrary changes, including becoming negative.

Possible "fixes" include var = torch.abs(var). (This is actually fine: if var becomes negative, then even abs(var) is going to be a nonsense value and the loss will still be terrible.)

Think about what is least terrible and implement.
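
For comparison, here is a minimal sketch of the candidate guards, written with plain Python floats so it runs without any dependencies. The function name `ln_scale`, its `mode` argument, and the default `eps` are hypothetical and not part of the rib codebase; on tensors the same guards would be applied elementwise with `torch.abs` or `torch.clamp`.

```python
import math


def ln_scale(var: float, eps: float = 1e-5, mode: str = "abs") -> float:
    """Return the LayerNorm scale 1 / sqrt(var + eps), guarding against var < 0.

    Hypothetical helper for illustration only; not the rib implementation.
    """
    if mode == "abs":
        # Mirror of var = torch.abs(var): keeps the magnitude, drops the sign.
        var = abs(var)
    elif mode == "clamp":
        # Treat a negative variance as zero, so the scale saturates at 1 / sqrt(eps).
        var = max(var, 0.0)
    elif mode == "raise":
        # Fail loudly instead of silently producing a meaningless scale.
        if var < 0:
            raise ValueError(f"negative variance: {var}")
    else:
        raise ValueError(f"unknown mode: {mode}")
    return 1.0 / math.sqrt(var + eps)


print(ln_scale(-4.0, eps=0.0, mode="abs"))  # 0.5, same as for var = 4.0
print(ln_scale(-4.0, mode="clamp"))         # large: 1 / sqrt(eps)
```

The options differ mainly in how they fail: "abs" yields a finite but arbitrary scale, "clamp" yields a very large scale of 1 / sqrt(eps), and "raise" stops the run. Without any guard, the square root of a negative number gives NaN in torch, which then propagates into the loss.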