heshenghuan / LSTM-CRF

A (CNN+)RNN(LSTM/BiLSTM)+CRF model for sequence labelling. :smirk:

What's the principle of the sum of embedding vectors as the feature template score? #8

Open smy69 opened 7 years ago

smy69 commented 7 years ago

In hybrid_model.py, you add a feature template score; if I am right, the implementation is at line 116:

scores = feat_sum + tf.matmul(outputs, self.W) + self.b

where feat_sum is defined as:

features = tf.nn.embedding_lookup(self.T, F)
feat_sum = tf.reduce_sum(features, axis=2)
feat_sum = tf.reshape(feat_sum, [-1, self.nb_classes])

It seems that feat_sum is just the sum of the embedding vectors, so what is the principle behind this?

heshenghuan commented 7 years ago

@smy69 Actually, feat_sum is the sum of the weights of feature functions (local functions).

In an ME or CRF model, we define a set of feature functions, which are usually binary. If a feature function is activated, its weight is added to an accumulated value, i.e. the feature score.

So this summation can be viewed as multiplying a sparse vector by a weight matrix, where the sparse vector is the sum of the one-hot vectors for the indices of the activated feature functions.
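For example, here is a toy NumPy sketch (the sizes and indices are made up for illustration, not taken from the repo) showing that summing the weight rows of the activated feature functions gives exactly the same score as multiplying the 0/1 sparse vector by the weight matrix:

import numpy as np

n_feature_ids, nb_classes = 6, 3
W = np.random.randn(n_feature_ids, nb_classes)  # one weight vector per feature function

active = [0, 4]                   # indices of the feature functions that fired
score = W[active].sum(axis=0)     # accumulate their weights

x = np.zeros(n_feature_ids)       # sparse vector: sum of the one-hot index vectors
x[active] = 1.0
assert np.allclose(score, x @ W)  # identical score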

This is very similar to an embedding lookup. Since sparse matrix multiplication is expensive, I used embedding_lookup instead.
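Here is a minimal TF 2.x eager sketch (my own toy shapes and random ids, only mirroring the shapes used in hybrid_model.py) checking that embedding_lookup plus reduce_sum matches the dense one-hot matmul:

import numpy as np
import tensorflow as tf

n_feature_ids, nb_classes = 10, 3
batch, seq_len, n_templates = 2, 5, 4

T = tf.random.normal([n_feature_ids, nb_classes])           # per-feature weight vectors
F = tf.random.uniform([batch, seq_len, n_templates],
                      maxval=n_feature_ids, dtype=tf.int32) # activated feature ids

# Lookup route, as in hybrid_model.py: gather each activated
# feature's weight vector, then sum over the template axis.
feat_sum = tf.reduce_sum(tf.nn.embedding_lookup(T, F), axis=2)
feat_sum = tf.reshape(feat_sum, [-1, nb_classes])

# Dense route: sum of one-hot vectors, then an ordinary matmul.
x = tf.reduce_sum(tf.one_hot(F, depth=n_feature_ids), axis=2)
dense = tf.matmul(tf.reshape(x, [-1, n_feature_ids]), T)

np.testing.assert_allclose(feat_sum.numpy(), dense.numpy(), rtol=1e-5, atol=1e-6)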

Hope it helps.