Hi, it was a pleasure to read your paper. However, I ran into some questions while reading your code. In the first figure, you seem to extract three attribute vectors: att, original_att, and w2v_att. I gather the third one contains class-attribute embeddings obtained through word2vec — but your paper seems to use GloVe, so which is it? Could you also explain the difference between att and original_att? They both appear to have shape (312, 200). I also found class description files in the public CUB dataset — how do they differ from these? And why do you load three attribute-based representations in the first place?