Round negligible feature activation floats to 0.0

DavidUdell / sparse_circuit_discovery

Circuit discovery in GPT-2 small, using sparse autoencoding

MIT License

7 stars, 1 fork

Round negligible feature activation floats to 0.0 #63

Closed: DavidUdell closed this issue 8 months ago

DavidUdell commented 8 months ago

Some of the really funky features (4.43, 4.267, 4.281) might be noise from the computations. I don't want to erase any data, but it might make sense to use a tolerant torch comparison operation rather than `value == 0.0`.
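
For illustration, here is a minimal sketch of what a tolerant zero check could look like using `torch.isclose`. This is not the repository's actual code: the helper name `negligible_mask`, the sample activations, and the tolerance `ATOL = 1e-6` are all assumptions made up for the example, and a suitable tolerance would depend on the scale of the real activations. The mask is computed without modifying the input, so no data is erased. Rounding is applied only to a separate copy.

```python
import torch

# Hypothetical tolerance; the issue does not specify a value.
ATOL = 1e-6


def negligible_mask(acts: torch.Tensor, atol: float = ATOL) -> torch.Tensor:
    """Return a boolean mask of entries within `atol` of zero.

    The input tensor is left untouched.
    """
    # With `other` equal to zero, isclose reduces to |acts| <= atol.
    return torch.isclose(acts, torch.zeros_like(acts), rtol=0.0, atol=atol)


# Made-up activations: one exact zero, two tiny values, two real activations.
acts = torch.tensor([0.0, 3e-9, -2e-8, 0.5, -1.2])

exact = acts == 0.0             # strict comparison misses the tiny values
tolerant = negligible_mask(acts)

# Rounded copy for downstream use; `acts` itself keeps its original values.
rounded = torch.where(tolerant, torch.zeros_like(acts), acts)

print(exact.tolist())
print(tolerant.tolist())
```

The strict comparison flags only the exact zero, while the tolerant mask also catches the two near-zero entries. Keeping `rtol=0.0` makes the threshold purely absolute, which is the relevant behavior when comparing against zero.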