Training takes too long

ai-safety-foundation / sparse_autoencoder

Sparse Autoencoder for Mechanistic Interpretability

MIT License


Status: Open. Opened by BiEchi 7 months ago

BiEchi commented 7 months ago

Training looks like it will take about 40 hours, yet it uses only about 600 MB of CUDA memory. Is this normal?

Activations trained on: 0%| | 4997120/1000000000 [12:35<40:55:48, 6752.73it/s, stage=train]
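As a sanity check on the estimate, the remaining time shown in the progress bar can be reproduced from the counts and the rate in that same line. This is only a minimal sketch: the helper names `eta_seconds` and `format_hms` are made up for illustration and are not part of the sparse_autoencoder library, and it assumes the rate stays constant for the rest of the run.

```python
def eta_seconds(done: int, total: int, rate_per_s: float) -> float:
    """Seconds left if the remaining items are processed at a constant rate."""
    return (total - done) / rate_per_s


def format_hms(seconds: float) -> str:
    """Format a duration as H:MM:SS, truncating fractional seconds."""
    whole = int(seconds)
    hours, rest = divmod(whole, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


# Values copied from the progress bar in the log line above.
done = 4_997_120          # activations processed so far
total = 1_000_000_000     # total activations to train on
rate = 6752.73            # activations per second

print(format_hms(eta_seconds(done, total, rate)))  # prints 40:55:48
```

The result matches the 40:55:48 in the log, so the roughly 40 hour figure follows directly from one billion activations at about 6,750 activations per second. Under this constant-rate assumption, only a higher throughput or a smaller activation count would shorten the run.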

BiEchi commented 7 months ago

Let me rephrase my question: which file handles multi-GPU training for the autoencoder?