Currently, I collect activations from the residual stream only. It should be straightforward to also collect activations of the same shape from MLP-out and attention-out, that is, the outputs the MLP and attention sublayers add back into the residual stream. This would give some insight into what each sublayer contributes during model inference.
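As a rough illustration of how the extra collection could work, here is a minimal sketch using PyTorch forward hooks. It is not the actual collection code: `ToyBlock` and `collect_activations` are hypothetical names, and the pre-LayerNorm block layout is an assumption made only so the example is self-contained.

```python
import torch
import torch.nn as nn


class ToyBlock(nn.Module):
    """Hypothetical pre-LayerNorm transformer block, for illustration only."""

    def __init__(self, d_model: int = 16, n_heads: int = 4):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ln2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )

    def forward(self, x):
        h = self.ln1(x)
        attn_out, _ = self.attn(h, h, h, need_weights=False)
        x = x + attn_out                  # attention-out is added to the residual stream
        x = x + self.mlp(self.ln2(x))     # MLP-out is added to the residual stream
        return x


def collect_activations(block, x):
    """Run one forward pass and return attention-out, MLP-out and the
    post-block residual stream, keyed by name (hypothetical helper)."""
    cache = {}

    def save(name):
        def hook(module, inputs, output):
            # nn.MultiheadAttention returns a tuple (attn_output, attn_weights)
            out = output[0] if isinstance(output, tuple) else output
            cache[name] = out.detach()
        return hook

    handles = [
        block.attn.register_forward_hook(save("attn_out")),
        block.mlp.register_forward_hook(save("mlp_out")),
        block.register_forward_hook(save("resid_post")),
    ]
    try:
        with torch.no_grad():
            block(x)
    finally:
        # Always remove hooks so later forward passes are unaffected
        for handle in handles:
            handle.remove()
    return cache


x = torch.randn(2, 5, 16)  # (batch, sequence, d_model)
cache = collect_activations(ToyBlock(), x)
for name, act in cache.items():
    print(name, tuple(act.shape))
```

All three cached tensors share the shape `(batch, sequence, d_model)`, so the same downstream analysis used for the residual stream should apply to them unchanged. In a real model the module names to hook would differ, and would need to be looked up in that model's implementation.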