richliao / textClassifier

Text classifier for Hierarchical Attention Networks for Document Classification
Apache License 2.0
1.07k stars 379 forks

Not able to train HAN because of the following error. #32

Open kk54709 opened 5 years ago

kk54709 commented 5 years ago

    # Imports assumed from the surrounding script; AttLayer is the repo's
    # custom attention layer, and embedding_layer, MAX_SENTS and
    # MAX_SENT_LENGTH are defined earlier.
    from keras.layers import Input, Dense, GRU, Bidirectional, TimeDistributed
    from keras.models import Model

    # Sentence encoder: word sequence -> sentence vector
    sentence_input = Input(shape=(MAX_SENT_LENGTH,), dtype='int32')
    embedded_sequences = embedding_layer(sentence_input)
    l_lstm = Bidirectional(GRU(100, return_sequences=True))(embedded_sequences)
    l_att = AttLayer(100)(l_lstm)
    sentEncoder = Model(sentence_input, l_att)

    # Document encoder: sentence vectors -> document vector -> prediction
    review_input = Input(shape=(MAX_SENTS, MAX_SENT_LENGTH), dtype='int32')
    review_encoder = TimeDistributed(sentEncoder)(review_input)
    l_lstm_sent = Bidirectional(GRU(100, return_sequences=True))(review_encoder)
    l_att_sent = AttLayer(100)(l_lstm_sent)
    preds = Dense(2, activation='softmax')(l_att_sent)
    model = Model(review_input, preds)

    model.compile(loss='categorical_crossentropy', optimizer='rmsprop',
                  metrics=['acc'])

    print("model fitting - Hierarchical attention network")

Error:

    ValueError: Dimensions must be equal, but are 15 and 100 for 'att_layer_10/mul' (op: 'Mul') with input shapes: [?,15], [?,15,100].

cjopengler commented 5 years ago

I got the same error

cjopengler commented 5 years ago

I found that changing this code in AttLayer from

    def compute_mask(self, inputs, mask=None):
        return mask

to

    def compute_mask(self, inputs, mask=None):
        return None

fixes it. In other words, the layer should not propagate the mask.
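
For context, a minimal sketch of where this method sits, paraphrased from the repo's AttLayer (initializer and naming details may differ from the actual source):

    from keras import backend as K
    from keras import initializers
    from keras.layers import Layer

    class AttLayer(Layer):
        def __init__(self, attention_dim, **kwargs):
            self.init = initializers.get('normal')
            self.attention_dim = attention_dim
            super(AttLayer, self).__init__(**kwargs)

        def build(self, input_shape):
            # input_shape: (batch, timesteps, features)
            self.W = K.variable(self.init((input_shape[-1], self.attention_dim)))
            self.b = K.variable(self.init((self.attention_dim,)))
            self.u = K.variable(self.init((self.attention_dim, 1)))
            self.trainable_weights = [self.W, self.b, self.u]
            super(AttLayer, self).build(input_shape)

        def compute_mask(self, inputs, mask=None):
            # The fix: stop the mask here. The attention-weighted sum in
            # call() removes the time axis, so a (batch, timesteps) mask
            # no longer matches the (batch, features) output.
            return None

        def call(self, x, mask=None):
            uit = K.tanh(K.bias_add(K.dot(x, self.W), self.b))
            ait = K.exp(K.squeeze(K.dot(uit, self.u), -1))
            if mask is not None:
                ait *= K.cast(mask, K.floatx())   # the multiply that fails
            ait /= K.cast(K.sum(ait, axis=1, keepdims=True) + K.epsilon(),
                          K.floatx())
            return K.sum(x * K.expand_dims(ait), axis=1)

        def compute_output_shape(self, input_shape):
            return (input_shape[0], input_shape[-1])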

kk54709 commented 5 years ago

Solved the issue. Thanks @cjopengler.

cjopengler commented 5 years ago

> Solved the issue. Thanks @cjopengler.

But have you noticed that the first AttLayer, 'l_att = AttLayer(100)(l_lstm)', raises no error, while the second one, 'l_att_sent = AttLayer(100)(l_lstm_sent)', fails when computing the mask?

kk54709 commented 5 years ago

Well, I'm still experimenting with it. Earlier I was facing the same issue, but now it is working fine.

MingleiLI commented 5 years ago

I had the same problem, and changing the code as @cjopengler suggests works.

mingkin commented 5 years ago

Solved the issue. Thanks @cjopengler.

980202006 commented 5 years ago

It is because of the TimeDistributed wrapper: the mask it propagates to the second AttLayer is

    previous_mask = <tf.Tensor 'time_distributed_14/Reshape_2:0' shape=(?, 15, 100) dtype=bool>

i.e. a 3-D word-level mask rather than a 2-D sentence-level one. Possible resolutions are described at https://blog.csdn.net/songbinxu/article/details/80242211
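
To make the shape clash concrete: assuming the Embedding layer was built with mask_zero=True, its (batch, MAX_SENT_LENGTH) word mask passes through the first AttLayer unchanged (because compute_mask returned mask), TimeDistributed reshapes it to (?, 15, 100), and the second AttLayer then multiplies its (batch, 15) attention weights by that stale 3-D mask. A NumPy sketch of the mismatch (names and values are illustrative):

    import numpy as np

    batch, MAX_SENTS, MAX_SENT_LENGTH = 2, 15, 100
    ait = np.ones((batch, MAX_SENTS))            # attention weights over sentences
    stale_mask = np.ones((batch, MAX_SENTS, MAX_SENT_LENGTH))  # leaked word mask

    # Mirrors `ait *= K.cast(mask, K.floatx())` inside AttLayer.call:
    ait * stale_mask
    # ValueError: operands could not be broadcast together with
    # shapes (2,15) (2,15,100) -- the same 15-vs-100 clash TensorFlow
    # reports as "Dimensions must be equal, but are 15 and 100".

Returning None from compute_mask stops the word mask at the sentence encoder, so the second AttLayer simply sees no mask.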