keras-team / keras


Is there any change in TimeDistributedDense? #568

Closed · kgzy closed this issue 7 years ago

kgzy commented 9 years ago

I used TimeDistributedDense to label sequences at each timestep before:

model.add(Embedding(wordlen, worddim))
model.add(GRU(worddim, 128, return_sequences=True, activation="tanh"))
model.add(TimeDistributedDense(128, 3, activation="tanh"))
model.add(Activation('time_distributed_softmax'))

It worked well before, but it always predicts the same label after I updated my Keras yesterday. Has anything changed? I didn't find a solution in the documentation or in earlier issues.

Thank you

lemuriandezapada commented 9 years ago

'time_distributed_softmax', if I recall right, is deprecated. There might be an issue with the axis alignment. Just call regular softmax.
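To make the axis point concrete, here is a small NumPy sketch (my own illustration, not Keras code) of why applying softmax over the wrong axis of a 3D sequence output breaks the per-timestep class probabilities:

```python
import numpy as np

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

scores = np.random.randn(2, 5, 3)   # (samples, timesteps, classes)
good = softmax(scores, axis=-1)     # normalizes over the class axis
bad = softmax(scores, axis=1)       # normalizes over the time axis instead

print(good.sum(axis=-1))  # all ones: valid per-timestep class distributions
print(bad.sum(axis=-1))   # not ones: these are not class probabilities
```

If the deprecated activation (or the data layout it assumed) ends up normalizing over the wrong axis, the per-timestep argmax can easily become degenerate.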

kgzy commented 9 years ago

I changed 'time_distributed_softmax' to 'softmax', but the problem still exists.

model.add(Embedding(wordlen, worddim))
model.add(GRU(worddim, 128, return_sequences=True, activation="tanh"))
model.add(Dropout(0.3))
model.add(TimeDistributedDense(128, 3, activation="tanh"))
model.add(Activation('softmax'))

Is there any error in my code? I'm confused because it worked well in an earlier Keras.

fchollet commented 9 years ago

So what's your error?

kgzy commented 9 years ago

There is no exception. The output layer is a three-category classification ([1,0,0], [0,1,0], [0,0,1]). I got good performance with an earlier Keras. After I updated Keras, it classifies all nodes into the same class ([1,0,0]).
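A quick sanity check for this kind of collapse, as a rough sketch only (`model` is the compiled network above; `X_test` is a hypothetical batch of padded token indices, not shown in this thread):

```python
import numpy as np

probs = model.predict(X_test)    # shape: (samples, timesteps, 3)
labels = probs.argmax(axis=-1)   # predicted class index per timestep

# A healthy model should use all three classes; the symptom described
# here is that every timestep comes out as class 0, i.e. [1, 0, 0].
print(np.unique(labels))
```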

fchollet commented 9 years ago

Maybe try reverting to various earlier versions of Keras until you can identify which commit caused it. I'd be interested in the results. Right now I'm very skeptical that the issue is with Keras.

wxs commented 9 years ago

Actually, there might be an issue, which I'm still investigating. I think a bug snuck in related to this nonzero() call (being discussed over in #573 for other reasons): it joins the time and datapoint dimensions, and therefore screws up the time_distributed_softmax objective.

This line:

masked_y_true = y_true[weights.nonzero()[:-1]]

is supposed to just subset the y array, but it is also reshaping it from (a, b, c) to (a*b, c). Working on a fix.
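A minimal NumPy stand-in for the effect (my own sketch; the real code operates on Theano tensors, whose nonzero() behaves the same way, and I'm assuming per-timestep weights of shape (a, b, 1)):

```python
import numpy as np

a, b, c = 2, 3, 4                    # samples, timesteps, classes
y_true = np.zeros((a, b, c))         # one-hot targets
weights = np.ones((a, b, 1))         # assumed per-timestep sample weights

# nonzero() returns one index array per axis; dropping the last one
# leaves the (sample, timestep) coordinates of every nonzero weight.
masked_y_true = y_true[weights.nonzero()[:-1]]

print(y_true.shape)         # (2, 3, 4)
print(masked_y_true.shape)  # (6, 4): the sample and time axes are merged
```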

wxs commented 9 years ago

Sorry, please disregard my earlier comment; I don't think my other issue has anything to do with this. :)

kgzy commented 9 years ago

@wxs Thanks. I want to use bidirectional RNNs. Is it necessary for bidirectional RNNs? And which commit is it?