Zlatan-Ibrahi opened this issue 9 months ago (status: Open)
I would like to analyze the attention map of my own trained model, but I am not very clear about some details. For example, do we take the average of the attention maps across multiple heads? Could you provide the code for this?
Same question. Any solutions?
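Not an official answer, but here is a minimal sketch of the head-averaging step asked about above. It assumes you have already extracted one layer's post-softmax attention weights as an array of shape `(num_heads, num_queries, num_keys)`. How you get that array depends on your model, and nothing in this thread says how this repo exposes it, so the example uses random data instead. The names `average_heads` and `softmax` are made up for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax, used here only to build fake attention weights."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def average_heads(attn):
    """Average attention weights over the head axis.

    attn: array of shape (num_heads, num_queries, num_keys).
    Returns an array of shape (num_queries, num_keys).
    """
    attn = np.asarray(attn, dtype=np.float64)
    if attn.ndim != 3:
        raise ValueError("expected shape (num_heads, num_queries, num_keys)")
    return attn.mean(axis=0)


# Stand-in for real attention weights (assumption: 4 heads, 6 tokens).
rng = np.random.default_rng(0)
num_heads, num_tokens = 4, 6
attn = softmax(rng.normal(size=(num_heads, num_tokens, num_tokens)))

mean_map = average_heads(attn)  # one map, averaged over heads
max_map = attn.max(axis=0)      # alternative: keep the strongest head per entry
```

Averaging is only one choice. Each row of `mean_map` still sums to 1, so it remains a distribution over keys, but averaging can hide heads that attend to very different regions. Plotting each head separately, or taking the max as in `max_map`, may be more informative depending on what you want to inspect.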