I have experimented with IAN in the AREkit framework for sentiment attitudes extraction (this implementation has been embedded into the toolkit). The problem was that all the attention weights within a context/aspect remained equal to 1. As a result, only the last perceptron layer changed during training. All the other hidden states remained the same, as I think, due to the absence of variation and hence of a gradient for backpropagation.
Clarifying the axis from 0 (the default; i -- batch) to 1 (j -- context words) ("ij" in einsum notation) fixed the problem. I suppose the earlier bug led to worse results, since the implementation has now been updated.
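The symptom above can be reproduced with a minimal numpy sketch (not the AREkit code itself; I assume here that the normalization in question is a softmax over the attention scores). When the batch size is 1 and the softmax is taken over axis 0 (batch) instead of axis 1 (context words), every weight collapses to exactly 1:

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# scores[i, j]: attention score of context word j in batch item i
# ("ij" in einsum notation: i -- batch, j -- context words).
scores = np.array([[1.0, 2.0, 3.0]])  # a batch of one context

# Wrong: normalizing over the batch axis -- each column has a single
# element, so every weight becomes 1 and carries no gradient signal.
wrong = softmax(scores, axis=0)   # [[1. 1. 1.]]

# Right: normalizing over the context words -- a proper distribution.
right = softmax(scores, axis=1)   # rows sum to 1
```

With constant weights of 1, the attention output no longer depends on the scores, which would explain why only the final layer kept updating.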
Before (only the last layer updates)
After (all layers update their values during training)
The first two matrices are the context and aspect weights, respectively.
I used a pair of aspects (Object/Subject), tested on the sentiment attitudes extraction task with the RuSentRel dataset.