[Proposal] Rename `l1_coefficient` to `sparsity_coefficient`

jbloomAus / SAELens

Training Sparse Autoencoders on Language Models

MIT License

481 stars, 127 forks

Proposal

Now that we support JumpReLU training, the name `l1_coefficient` is confusing, because JumpReLU trains with an L0 loss, not an L1 loss. We should rename this parameter to `sparsity_coefficient`, since it is a coefficient that promotes sparsity in general. We should also rename `l1_warmup_steps` to `sparsity_warmup_steps`.
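To make the rename concrete, here is a minimal sketch of how the old names could be mapped onto the proposed ones. It is not the existing SAELens config: `TrainingConfigSketch`, `config_from_kwargs`, the `architecture` field, the default values, and the deprecation-warning behavior are all assumptions made for illustration. Only the four parameter names come from this proposal.

```python
import warnings
from dataclasses import dataclass

# Old name -> proposed name, as listed in this proposal.
_RENAMED = {
    "l1_coefficient": "sparsity_coefficient",
    "l1_warmup_steps": "sparsity_warmup_steps",
}


@dataclass
class TrainingConfigSketch:
    """Hypothetical stand-in for a training config; defaults are placeholders."""

    architecture: str = "standard"
    sparsity_coefficient: float = 1e-3
    sparsity_warmup_steps: int = 0


def config_from_kwargs(**kwargs) -> TrainingConfigSketch:
    """Build the config, accepting the old names with a DeprecationWarning."""
    for old, new in _RENAMED.items():
        if old in kwargs:
            if new in kwargs:
                raise ValueError(f"Pass either {old!r} or {new!r}, not both.")
            warnings.warn(
                f"{old!r} is deprecated; use {new!r} instead.",
                DeprecationWarning,
                stacklevel=2,
            )
            # Move the value over to the new, architecture-neutral name.
            kwargs[new] = kwargs.pop(old)
    return TrainingConfigSketch(**kwargs)


# Usage: an old-style call still works, but is stored under the new names.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    cfg = config_from_kwargs(architecture="jumprelu", l1_coefficient=0.5)
```

Whether the old names should keep working for a transition period, as assumed here, or be removed outright is a separate decision.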

Motivation

It is confusing to see `l1_coefficient` used for JumpReLU training, which doesn't use an L1 loss.

Alternatives

Alternatively, we could add separate `l0_coefficient` / `l0_warmup_steps` parameters that are used only for JumpReLU training, and raise an error if `l1_coefficient` is provided. This would also potentially allow training a JumpReLU SAE with both an L0 and an L1 loss, if desired.
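A rough sketch of this alternative, again not the actual SAELens config: `SeparateCoefficientsSketch`, the `architecture` field, the `None` defaults, and the exact validation rules are assumptions. In particular, rejecting `l0_coefficient` for non-JumpReLU runs is one possible reading of "used only for JumpReLU training"; silently ignoring it would be another.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SeparateCoefficientsSketch:
    """Hypothetical config with separate L1 and L0 sparsity settings."""

    architecture: str = "standard"
    l1_coefficient: Optional[float] = None
    l1_warmup_steps: int = 0
    l0_coefficient: Optional[float] = None
    l0_warmup_steps: int = 0

    def __post_init__(self) -> None:
        is_jumprelu = self.architecture == "jumprelu"
        # JumpReLU trains with an L0 loss, so an L1 coefficient is rejected.
        if is_jumprelu and self.l1_coefficient is not None:
            raise ValueError(
                "l1_coefficient is not used for jumprelu; set l0_coefficient."
            )
        # Assumed rule: the L0 settings are meaningful only for JumpReLU.
        if not is_jumprelu and self.l0_coefficient is not None:
            raise ValueError("l0_coefficient is only used for jumprelu.")


# Usage: a JumpReLU run configured with the L0-specific parameters.
jumprelu_cfg = SeparateCoefficientsSketch(
    architecture="jumprelu", l0_coefficient=1.0, l0_warmup_steps=100
)
```

Relaxing the first check is where combined L0 and L1 training could be allowed, if that turns out to be wanted.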

Checklist

[x] I have checked that there is no similar issue in the repo (required)
