lpq29743 / IAN

A TensorFlow implementation of "Interactive Attention Networks for Aspect-Level Sentiment Classification"

Fixed bug, TensorFlow 1.12.0: attention weights remain equal to 1 for context and aspects during training #14

Closed: nicolay-r closed this issue 4 years ago

nicolay-r commented 4 years ago

I have been experimenting with IAN in the AREkit framework for sentiment attitude extraction (this implementation has been embedded into the toolkit). The problem was that all the attention weights within a context/aspect remained equal to 1. As a result, only the last perceptron layer changed during training; all the other hidden states stayed the same, I think due to the absence of variation, and hence of a gradient, for backpropagation.

Explicitly specifying the normalization axis, changing it from 0 (the default here; i, the batch dimension) to 1 (j, the context words), in ij einsum notation, fixed the problem.

I suppose the bug led to worse results before, since the implementation has now been updated.
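
For illustration, here is a minimal NumPy sketch of the axis issue (assumed shapes and names, not the exact AREkit/IAN code). With attention scores shaped [batch, context words] as in the ij notation above, normalizing over axis 0 degenerates: with a batch of one sentence, every weight becomes exactly 1 regardless of the scores, so no gradient flows back to the attention parameters. Normalizing over axis 1 gives a proper distribution over context words.

```python
# Minimal sketch of the axis bug (assumed shapes, not the repository's exact code).
# "ij" einsum notation: i = batch (axis 0), j = context words (axis 1).
import numpy as np

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

scores = np.random.randn(1, 5)  # one sentence, 5 context words

# Wrong axis: normalizing over the batch dimension. With a single sentence,
# every weight is exactly 1, independent of the scores, so the attention
# parameters receive no gradient during backprop.
print(softmax(scores, axis=0))  # [[1. 1. 1. 1. 1.]]

# Correct axis: normalizing over the context words gives a distribution
# that sums to 1 per sentence and varies with the scores.
print(softmax(scores, axis=1))
```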

nicolay-r commented 4 years ago

Before the fix (only the last layer updates): [attention heatmap: ian-ends]
After the fix (all layers update their values during training): [attention heatmap: ian-ends]

The first two matrices are the context and aspect weights, respectively. I used a pair of aspects (Object/Subject), tested on the sentiment attitude extraction task with the RuSentRel dataset.