Closed: jbloomAus closed this 2 weeks ago
Attention: Patch coverage is 51.85185% with 13 lines in your changes missing coverage. Please review.
Project coverage is 59.58%. Comparing base (33a612d) to head (7e9e501). Report is 1 commit behind head on main.
| Files | Patch % | Lines |
|---|---|---|
| sae_lens/load_model.py | 0.00% | 7 Missing and 1 partial :warning: |
| sae_lens/evals.py | 76.92% | 3 Missing :warning: |
| sae_lens/training/sae_trainer.py | 33.33% | 2 Missing :warning: |
:umbrella: View full report in Codecov by Sentry.
Description
I've been working with some larger models, and even though IO is still likely our bottleneck, there were a few easy multi-GPU wins I wanted to get out.
These were:
- `ActivationsStore` was running code without `torch.no_grad()`. If the model was in eval mode this wouldn't be an issue, but I've added these contexts to the key functions just in case (see the sketch after the example below).
- The `load_model` utility will now check if `n_devices` is set and force the model device to be "cuda". This is important for enabling the SAE to be on a device like "cuda:3" while the model is on devices "cuda:0" - "cuda:2".

We should add these to the docs at some point, but here's an example of using multiple devices for the model, SAE and activations store.
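A minimal sketch of what that looks like (exact class and field names vary by sae_lens version; `model_kwargs`, `act_store_device`, and the model/hook/dataset values below are illustrative assumptions, not verbatim from this PR):

```python
from sae_lens import LanguageModelSAERunnerConfig, SAETrainingRunner

# Sketch for a 4-GPU machine: the model is sharded over cuda:0 - cuda:2,
# the SAE trains on cuda:3, and the activations store buffers on CPU.
cfg = LanguageModelSAERunnerConfig(
    model_name="pythia-2.8b",              # placeholder model
    hook_name="blocks.16.hook_resid_pre",  # placeholder hook
    hook_layer=16,
    d_in=2560,
    dataset_path="Skylion007/openwebtext",  # placeholder dataset
    # n_devices is forwarded to HookedTransformer, which shards the
    # model's layers across cuda:0 - cuda:2; load_model now forces the
    # model device to "cuda" when n_devices is set.
    model_kwargs={"n_devices": 3},
    device="cuda:3",         # the SAE lives and trains here
    act_store_device="cpu",  # keep the activation buffer off the GPUs
)

SAETrainingRunner(cfg).run()
```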
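For reference, the `torch.no_grad()` change follows the usual pattern of wrapping activation-harvesting forward passes so no autograd graph is built (a sketch only; the function below is a stand-in for the `ActivationsStore` methods, not their actual signatures):

```python
import torch

def get_activations(model, tokens, hook_name):
    # We only need the cached activations, never gradients, so run the
    # forward pass under no_grad to avoid building the autograd graph
    # (which would otherwise keep every intermediate tensor alive).
    with torch.no_grad():
        _, cache = model.run_with_cache(tokens, names_filter=hook_name)
    return cache[hook_name]
```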
Type of change
Checklist:
- You have tested formatting, typing and unit tests (acceptance tests not currently in use)
- You have run `make check-ci` to check format and linting. (You can run `make format` to format code if needed.)