jbloomAus / SAELens

Training Sparse Autoencoders on Language Models
https://jbloomaus.github.io/SAELens/
MIT License

Supply `device` to `SAEConfigLoadOptions` #347

Closed callummcdougall closed 1 month ago

callummcdougall commented 1 month ago

This fixes a bug that was introduced last week. The issue seems to be that the newly introduced `SAEConfigLoadOptions` class contains a `device` field (default `None`), but this field was never set, even when the caller passed a `device` that is not `None`. Before that change, the config was just loaded directly from HuggingFace without overriding its device value (which on HuggingFace is presumably "cuda").
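To make the failure mode concrete, here is a minimal, self-contained sketch of the pattern described above. It does not use SAELens code: `LoadOptions`, `fetch_hosted_config`, `apply_options`, and both loader functions are hypothetical stand-ins, and the hosted config containing `"device": "cuda"` is an assumption taken from the "presumably" in the description.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class LoadOptions:
    """Hypothetical stand-in for an options class with a `device` field."""
    device: Optional[str] = None


def fetch_hosted_config() -> Dict[str, Any]:
    # Stand-in for a config downloaded from a model hub.
    # Assumption: the stored config records "cuda" as its device.
    return {"d_in": 768, "device": "cuda"}


def apply_options(cfg: Dict[str, Any], options: LoadOptions) -> Dict[str, Any]:
    # Override the stored device only when the options carry one.
    cfg = dict(cfg)
    if options.device is not None:
        cfg["device"] = options.device
    return cfg


def load_config_buggy(device: Optional[str] = None) -> Dict[str, Any]:
    # Bug pattern: the caller's `device` is accepted but never forwarded,
    # so the options keep the default of None.
    options = LoadOptions()
    return apply_options(fetch_hosted_config(), options)


def load_config_fixed(device: Optional[str] = None) -> Dict[str, Any]:
    # Fix pattern: supply `device` when constructing the options.
    options = LoadOptions(device=device)
    return apply_options(fetch_hosted_config(), options)


if __name__ == "__main__":
    print(load_config_buggy("cpu")["device"])
    print(load_config_fixed("cpu")["device"])
```

In this sketch the buggy loader silently keeps whatever device the stored config names, which is the kind of behavior a regression test could assert against by loading with an explicit `device` and checking the resulting config.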

Would be great if this PR could be merged soon if possible, and I can try to add tests for this kind of failure mode afterwards.

chanind commented 1 month ago

Thanks for finding this! Just merged, as this is a live bug. If you can add a test that would prevent this regression in the future, that would be great as well.

callummcdougall commented 1 month ago

Yep, I'll put adding a test on my todo list for this week!