AlexYangLi / ABSA_Keras

Keras Implementation of Aspect based Sentiment Analysis

Some questions #1

Closed Kiris-tingna closed 5 years ago

Kiris-tingna commented 5 years ago
1. Why use SpatialDropout1D instead of Dropout?
2. Why does the right part use go_backwards=True instead of pad_sequences(padding='post', truncating='post')?
3. Why does the left part end at $t$ + (aspect words length) while the right part starts from the left of $t$?
4. The ATAE paper mentions 'tanh(Wx Hn + Wp r)', while your code uses Activation('tanh')(Add()([v1, v2])). Could you explain how these match?
AlexYangLi commented 5 years ago

Hi @Kiris-tingna, thanks for asking! Here are my answers.

  1. According to Keras' documentation:

SpatialDropout1D performs the same function as Dropout, however it drops entire 1D feature maps instead of individual elements.

I use SpatialDropout1D instead of Dropout because I want to consider each word embedding (1D feature maps) as a whole and drop or keep them together.

  2. The right part uses go_backwards=True so that the aspect words are at the end of the sequence, which better utilizes the semantics of the aspect when the last hidden state of the LSTM is used for sentiment classification, as stated in Section 2.2 of the TD-LSTM paper.

  3. There are preceding and following contexts around the aspect words. The input of the left LSTM is the preceding context plus the aspect words, running left to right. The input of the right LSTM is the following context plus the aspect words, running right to left.

  4. In my code, I add a dense layer to r and to Hn, which gives v1 = Wx Hn and v2 = Wp r (see the Keras sketch after this list).
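A minimal Keras sketch of points 1, 2 and 4 (the layer sizes, tensor names, and the stand-in for the attention output r are placeholders for illustration, not the repo's actual code):

from keras.layers import Input, Embedding, SpatialDropout1D, LSTM, Dense, Add, Activation

max_len, vocab_size, embed_dim, hidden_dim = 20, 10000, 300, 300  # placeholder sizes

# 1. SpatialDropout1D on the embedding output drops entire 1D feature maps
#    instead of individual elements (cf. the Keras documentation quoted above)
word_input = Input(shape=(max_len,))
embed = Embedding(vocab_size, embed_dim)(word_input)
embed = SpatialDropout1D(0.2)(embed)

# 2. go_backwards=True makes the LSTM read the right context from right to left,
#    so the aspect words are processed last and shape the final hidden state
hn = LSTM(hidden_dim, go_backwards=True)(embed)      # last hidden state Hn

# 4. tanh(Wx Hn + Wp r): apply a Dense layer (no bias) to each tensor, add, then tanh
r = Dense(hidden_dim)(hn)                            # stand-in for the attention output r
v1 = Dense(hidden_dim, use_bias=False)(hn)           # Wx Hn
v2 = Dense(hidden_dim, use_bias=False)(r)            # Wp r
h_star = Activation('tanh')(Add()([v1, v2]))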

Kiris-tingna commented 5 years ago

Thank you very much. Another question: if the aspect contains two or more words, the aspect embedding uses flatten to concatenate them. Did you compare this with the mean pooling method? I think the second way makes more sense.

AlexYangLi commented 5 years ago

Yes, I do use the average of the aspect word embeddings to represent the aspect embedding. You can find it in preprocess.py. The concatenated aspect word embeddings are only used when the model requires a sequence input, like the IAN model, which uses an LSTM to encode the aspect text sequence.
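For illustration, a small sketch of the two options (the embedding matrix and word ids here are hypothetical, not the actual preprocess.py code):

import numpy as np

embed_dim = 300
embedding_matrix = np.random.rand(10000, embed_dim)   # hypothetical lookup table
aspect_word_ids = [42, 128]                            # a two-word aspect

# mean pooling: one fixed-size aspect vector, regardless of the aspect length
aspect_embed_mean = embedding_matrix[aspect_word_ids].mean(axis=0)    # shape (300,)

# concatenation (flatten): keeps one vector per word, for models like IAN
# that encode the aspect as a sequence
aspect_embed_concat = embedding_matrix[aspect_word_ids].reshape(-1)   # shape (600,)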

Kiris-tingna commented 5 years ago
import numpy as np


def split_into_left_and_right(samples, embedding, word_dict, labels, aspects):
    """
    Each sample in `samples` contains 1) the word ids and 2) the template (placeholder) indices.
    :param samples:
    :param embedding:
    :param word_dict:
    :return:
    """
    x_left, x_right = [], []
    y = []
    aspect = []
    for sample, label, asp in zip(samples, labels, aspects):
        # maximum sentence length
        max_length = len(sample)
        content_words = sample[0]
        template_ids = sample[1]
        # each placeholder marks an aspect position
        for aspect_index in template_ids:
            aspect_index = int(aspect_index)
            l = content_words[0: aspect_index]
            r = content_words[min(aspect_index + 1, max_length):]
            # words not in the embedding or not in the dictionary become 0 (so does the
            # placeholder); in the original paper the split is counted from the aspect words
            ll = [word_dict[l_word] if l_word in embedding and l_word in word_dict else 0 for l_word in l] + asp
            rr = asp + [word_dict[r_word] if r_word in embedding and r_word in word_dict else 0 for r_word in r]
            x_left.append(ll)
            x_right.append(rr)
            y.append(label)
            aspect.append(asp)
    x_left, x_right, aspect, y = np.array(x_left), np.array(x_right), np.array(aspect), np.array(y)
    return x_left, x_right, aspect, y

Explanation: [abc $t$ def, $t$ ghi $t$ jkl] with [aspect word] and label -1 can be processed into three training samples: [abc aspect word][aspect word def] -1, [def aspect word][aspect word ghi] -1, [ghi aspect word][aspect word jkl] -1.
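For instance, a toy call of the function above with hypothetical data (not from the repo), using a single placeholder:

word_dict = {'abc': 1, 'def': 2}
embedding = word_dict                        # assume every dictionary word has an embedding
samples = [(['abc', '$t$', 'def'], [1])]     # sentence "abc $t$ def", placeholder at index 1
labels = [-1]
aspects = [[7, 8]]                           # ids of the two aspect words

x_left, x_right, aspect, y = split_into_left_and_right(
    samples, embedding, word_dict, labels, aspects)
# x_left[0]  -> [1, 7, 8]   ("abc" + aspect words, read left to right)
# x_right[0] -> [7, 8, 2]   (aspect words + "def", to be read right to left)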

AlexYangLi commented 5 years ago

Hi @Kiris-tingna. In my code, I assume there is only one aspect per sentence, and I haven't checked the multi-aspect case. But thanks for sharing!
BTW, I think there's a bug in your code: max_length = len(sample). Shouldn't it be max_length = len(sample[0]), since sample[0] is the sentence and len(sample) always returns 2? Right?

Kiris-tingna commented 5 years ago

Yes, thanks!