jbloomAus / SAELens

Training Sparse Autoencoders on Language Models
https://jbloomaus.github.io/SAELens/
MIT License
386 stars, 106 forks

Move activation store to cpu #159

Closed tomMcGrath closed 4 months ago

tomMcGrath commented 4 months ago

Description

Currently the ActivationStore class keeps the activation buffer, which is typically large, on the same device as the model. That device is normally a GPU, where VRAM is a scarce resource.
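For a rough sense of scale, here is a back-of-the-envelope estimate of a buffer's footprint. The function name, its parameters, and the workload numbers are illustrative assumptions, not values taken from the SAELens config or from this PR.

```python
def activation_buffer_bytes(n_tokens: int, d_in: int, bytes_per_element: int = 4) -> int:
    """Approximate size of a dense buffer holding one d_in-wide activation vector per token."""
    return n_tokens * d_in * bytes_per_element


# Hypothetical workload: 2**20 buffered tokens of 768-dimensional float32 activations.
size = activation_buffer_bytes(n_tokens=2**20, d_in=768)
print(f"{size / 2**30:.2f} GiB")  # prints "3.00 GiB"
```

Under these assumed numbers, a buffer 8x larger would need about 24 GiB, which is easier to find in host RAM than in VRAM.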

This PR allows the ActivationStore to store the activation buffer on a different device, including CPU. The device to use is specified by the new act_store_device config entry, which defaults to using the same device as the model to ensure backwards compatibility.
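A minimal sketch of how such a backwards-compatible default could be resolved. Only the name act_store_device comes from this PR; the class name, the `device` field, and the `None` sentinel are assumptions made for illustration and may not match the actual config code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunnerConfigSketch:
    """Hypothetical stand-in for the training config; not the real SAELens class."""

    device: str = "cuda"  # where the model lives (assumed field name)
    act_store_device: Optional[str] = None  # None means "same device as the model" (assumed sentinel)

    def __post_init__(self) -> None:
        # Fall back to the model's device so existing configs behave as before.
        if self.act_store_device is None:
            self.act_store_device = self.device


# Existing configs keep the old behaviour; new ones can opt in to CPU storage.
legacy = RunnerConfigSketch(device="cuda:0")
offloaded = RunnerConfigSketch(device="cuda:0", act_store_device="cpu")
print(legacy.act_store_device, offloaded.act_store_device)  # prints "cuda:0 cpu"
```

With a fallback of this kind, callers that never set act_store_device see no change, which is the compatibility property described above.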

The linked W&B dashboard below demonstrates that this change has no effect on throughput vs storing activations in VRAM for a typical workload and allows at least an 8x scaleup of activation buffer size.

Type of change


Checklist:

You have tested formatting, typing and unit tests (acceptance tests not currently in use)

Performance Check.

If you have implemented a training change, please indicate precisely how performance changes with respect to the following metrics:

This is a performance improvement: increasing metrics is not the goal. This PR maintains throughput while freeing up VRAM: see W&B dashboard.

codecov[bot] commented 4 months ago

Codecov Report

Attention: Patch coverage is 90.00%, with 1 line in your changes missing coverage. Please review.

Project coverage is 67.13%. Comparing base (72179c8) to head (2accc81).

| Files | Patch % | Lines |
|---|---|---|
| sae_lens/training/config.py | 85.71% | 0 Missing and 1 partial :warning: |

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main     #159      +/-   ##
==========================================
+ Coverage   66.96%   67.13%   +0.17%
==========================================
  Files          19       19
  Lines        1704     1710       +6
  Branches      266      267       +1
==========================================
+ Hits         1141     1148       +7
+ Misses        506      504       -2
- Partials       57       58       +1
```

jbloomAus commented 4 months ago

Seems good. Do you have a moment to write a few tests? Bit nervous we could have a bug and not realise it here.

tomMcGrath commented 4 months ago

Sure, will add some to this PR.

On Wed, 22 May 2024 at 15:01, Joseph Bloom wrote:

Seems good. Do you have a moment to write a few tests? Bit nervous we could have a bug and not realise it here.

tomMcGrath commented 4 months ago

I've added some tests and fixed two issues: one in the test cacher, and one in my code that the new tests exposed.

jbloomAus commented 4 months ago

Awesome work! Thanks!