Refactors the Save Attention Maps script for plotting cross-attention maps.
Adds options for filtering the attention maps: Per-Token Attention Maps and One-Hot Argmax maps.
Each plot shows the token index, the token id (from the tokenizer), and the word associated with that token id.
Prefixes each saved filename with a unique sequence number so that existing files are not overwritten.
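
For reviewers unfamiliar with the two filtering options, here is a minimal NumPy sketch of what they could look like. It is not the script's actual code: the function names, the `(height, width, tokens)` layout, and the choice to emit a binary mask for the argmax variant are all assumptions made for illustration.

```python
import numpy as np


def per_token_maps(attn):
    """Split an attention tensor into one map per token.

    Assumes attn has shape (height, width, tokens); returns (tokens, height, width).
    """
    return np.moveaxis(attn, -1, 0)


def one_hot_argmax_maps(attn):
    """Keep, at each pixel, only the token with the highest attention.

    Returns a binary mask per token with shape (tokens, height, width).
    """
    winners = attn.argmax(axis=-1)                 # (height, width) winning token index
    num_tokens = attn.shape[-1]
    one_hot = np.eye(num_tokens, dtype=attn.dtype)[winners]  # (height, width, tokens)
    return np.moveaxis(one_hot, -1, 0)


# Hypothetical input: a 16x16 attention map over 5 tokens, normalized per pixel.
attn = np.random.default_rng(0).random((16, 16, 5))
attn /= attn.sum(axis=-1, keepdims=True)

soft = per_token_maps(attn)
hard = one_hot_argmax_maps(attn)
```

In this sketch the per-token maps keep the soft attention weights, while the one-hot maps assign every pixel to exactly one token, which makes overlapping regions easier to tell apart in a plot.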
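
One way a sequence-number prefix can prevent overwrites is sketched below. The helper names, the five-digit zero-padded format, and the `-` separator are hypothetical and may differ from what the script does.

```python
import re
import tempfile
from pathlib import Path


def next_sequence_number(out_dir):
    """Return one more than the highest numeric filename prefix in out_dir."""
    pattern = re.compile(r"^(\d+)-")
    numbers = [
        int(match.group(1))
        for path in Path(out_dir).iterdir()
        if (match := pattern.match(path.name))
    ]
    return max(numbers, default=-1) + 1


def sequenced_path(out_dir, name):
    """Build a path such as out_dir/00003-name that is not yet in use."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    return out_dir / f"{next_sequence_number(out_dir):05d}-{name}"


# Usage: saving twice under the same base name yields two distinct files.
with tempfile.TemporaryDirectory() as tmp:
    first = sequenced_path(tmp, "attention.png")
    first.touch()
    second = sequenced_path(tmp, "attention.png")
    second.touch()
    saved_names = sorted(p.name for p in Path(tmp).iterdir())
```

Scanning the directory on each save keeps the numbering correct across separate runs, at the cost of a directory listing per file.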