RobertCsordas / linear_layer_as_attention

The official repository for our paper "The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns via Spotlights of Attention".