keras-team / keras

Deep Learning for humans
http://keras.io/
Apache License 2.0

LSTM smooth prediction, time-steps doubt #971

Closed santi-pdp closed 8 years ago

santi-pdp commented 8 years ago

Hi all,

I am working on a project where I have to predict a sequence of normalized real coefficients from an input sequence of vectors containing categorical and real values (i.e. from a text I extract, for each word, properties of that word, distances to other words, etc.). When I was using DNNs, I just fed in a windowed version of a few input vectors, taking some past and future samples (e.g. [v(-2),v(-1),v(0),v(1),v(2)]), and asked the net to predict the desired output coefficient (c0).

Now that I am working with LSTMs, though, I suspect I am structuring the data wrongly, because the output I get over time is a heavily smoothed version of what I had with DNNs. The DNN output had much more variance, which is what the prediction is supposed to show, instead of something close to a mean as with the current RNN version.

As far as I understand, if I have a file with one feature vector per line and its corresponding output (i.e. [[v(0), c0],[v(1),c1],...,[v(N),cN]]), I have to build a set of maxlen sequences of those inputs with fixed time-steps, so that the X input matrix to the model has shape (number of samples in file, desired time steps, feature vector length), giving me: [[[-1],[-1],[-1],v(0)], [[-1],[-1],v(0),v(1)], [[-1],v(0),v(1),v(2)], ...] where [-1] is a special padding vector of -1 with the same length as the input feature vector. The y matrix for the model would then be [c0,c1,c2,...], as I want the Nth coefficient predicted after the corresponding time-windowed input sample.

Does anyone have any advice or a clue about what could be causing a wrong prediction that seems to stick to the mean value? And a basic question, since I may not have this as clear as I thought: what effect do these fixed time-steps have on the model? In training, for instance, may I shuffle the rows of the X matrix? What would be the effect on the long past context of the LSTM?
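To make the data layout described above concrete, here is a minimal NumPy sketch of the windowing. The function name `make_windows`, the toy data, and the choice of 4 time-steps are illustrative assumptions, not code from my project; only the padding value of -1 and the (samples, time steps, features) shape come from the description.

```python
import numpy as np

def make_windows(features, targets, timesteps, pad_value=-1.0):
    """Build X with shape (samples, timesteps, feature_dim).

    Window n ends at v(n); missing past steps are filled with pad_value.
    `make_windows` is a hypothetical helper used only for illustration.
    """
    features = np.asarray(features, dtype=np.float32)
    targets = np.asarray(targets, dtype=np.float32)
    n_samples, feat_dim = features.shape
    # Prepend timesteps-1 padding vectors so the first window ends at v(0).
    pad = np.full((timesteps - 1, feat_dim), pad_value, dtype=np.float32)
    padded = np.concatenate([pad, features], axis=0)
    # One window per original row: rows n .. n+timesteps-1 of the padded array.
    X = np.stack([padded[i:i + timesteps] for i in range(n_samples)])
    return X, targets

# Toy data (assumed): 5 time points, 2 features each, one target per point.
v = np.arange(10, dtype=np.float32).reshape(5, 2)
c = np.arange(5, dtype=np.float32)

X, y = make_windows(v, c, timesteps=4)
print(X.shape, y.shape)

# Shuffle rows of X and y with the same permutation so pairs stay aligned.
rng = np.random.default_rng(0)
perm = rng.permutation(len(X))
X_shuf, y_shuf = X[perm], y[perm]
```

Each row of X is a self-contained window paired with its own target, so applying one permutation to both X and y keeps the pairs intact. Assuming a non-stateful LSTM, where the state is reset for every sample, the past context the network can see is limited to the time-steps inside each window.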

Thank you

santi-pdp commented 8 years ago

I have solved the problem; it was related to normalization issues.