Open gao27024037 opened 5 years ago
Hi there,
You asked, "NDX should be calculated by these stock prices, isn’t it? " The answer is:
I hope that answers your question. If not, send me a response.
-Bill
Thank you for your answer. But I still have a question.
I understand that the next NDX value is predicted from the previous prices of the NDX components, with that information stored in the encoder-decoder. But in the stock CSV file, the NDX price and the component prices on any given row come from the same time or the same day, which means the component prices at time T are contained in the input X, and the code appears to do the same. So I cannot understand why an input X with components at time T is used to predict the output Y, which is the NDX at time T.
Hi again,
I'll need a day or so to look through the da_rnn code vs the equations for the decoder (in the original paper) so that I can show you the exact code lines where y values at time T are paired with encoded sequences of X values up to T-1.
Thx, -Bill
Thank you. I appreciate you taking time out of your busy schedule to help me.
I believe the confusion about the da_rnn model results from this specific line (line 72 in the decoder cell of the Jupyter notebook da_rnn_from_csv.ipynb):
y_tilde = self.w(torch.cat((context, y_history[:, t].unsqueeze(1)), dim = 1)) # batch_size, 1
This line implements equation 15 in the original paper (https://arxiv.org/pdf/1704.02971.pdf). I believe the authors used the previous history of y values to enhance the decoder's ability to select the most relevant parts of the encoder input. It's like the authors were regressing the future value of NDX using both the previous values of NDX's components and the previous values of NDX itself.
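To make the arithmetic of that line concrete, here is a minimal pure-Python sketch of the same idea: concatenate the context vector with the previous y value and apply one linear layer. The function name y_tilde_step and the example weights are hypothetical and are not taken from the notebook, where self.w is a torch.nn.Linear layer.

```python
def y_tilde_step(context, y_prev, weights, bias):
    """Sketch of equation 15: a linear map over [context; previous y].

    context: list of floats (attention-weighted encoder summary, assumed given)
    y_prev:  float, the known target value from the previous time step
    weights: list of floats, one per element of the concatenated input
    bias:    float
    """
    combined = list(context) + [y_prev]  # same order as torch.cat((context, y), dim=1)
    if len(weights) != len(combined):
        raise ValueError("weights must match the concatenated input length")
    return sum(w * v for w, v in zip(weights, combined)) + bias


# Hypothetical numbers, for illustration only.
example = y_tilde_step(context=[1.0, 2.0], y_prev=3.0,
                       weights=[0.5, 0.5, 1.0], bias=0.1)
```

Note that only a past value of y enters this step, which is the point of the explanation above.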
After equation 15, equation 16 runs an LSTM using the previous hidden state of the decoder LSTM and this new input called y_tilde. This LSTM is run for each time step t in order to build up the full hidden variable.
self.lstm_layer.flatten_parameters()
_, lstm_output = self.lstm_layer(y_tilde.unsqueeze(0), (hidden, cell))
# ********************** Eqn. 16: LSTM **********************
# update values
hidden = lstm_output[0] # 1 * batch_size * decoder_hidden_size
cell = lstm_output[1] # 1 * batch_size * decoder_hidden_size
hidden will finally be used in equation 22 to produce a prediction.
y_pred = self.fc_final(torch.cat((hidden[0], context), dim = 1))
return y_pred
In da_rnn's method train_interation, the next (and maybe most important) lines of code show how only future values of y are used to actually perform the desired regression. In those lines, the future value of y (the variable y_true) and the decoder output y_pred are used as inputs to the loss function, which is followed by back-propagation:
y_pred = self.decoder(input_encoded, Variable(torch.from_numpy(y_history).type(torch.FloatTensor)))
y_true = Variable(torch.from_numpy(y_target).type(torch.FloatTensor)).reshape(y_target.shape[0],1)
loss = self.loss_func(y_pred, y_true)
loss.backward()
I believe you will see that y_true gets constructed from y_target, which contains values of y from time T, not time T-1.
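The pairing described above can be sketched in a few lines of plain Python. This is an illustration only, assuming the inputs are windowed so that everything fed to the encoder and decoder stops one step before the target; the helper make_window and its index convention are hypothetical and may differ from the exact indexing in the notebook.

```python
def make_window(X, y, t_end, T):
    """Build one training sample whose target is y[t_end].

    X: list of rows, each row holding the driving-series values for one time step
    y: list of target-series values, aligned row-by-row with X
    T: window length, counting the target step
    Returns (x_window, y_history, y_target).
    """
    start = t_end - (T - 1)
    if start < 0 or t_end >= len(y):
        raise ValueError("window out of range")
    x_window = X[start:t_end]   # T-1 rows of driving series, all before t_end
    y_history = y[start:t_end]  # T-1 past target values, all before t_end
    y_target = y[t_end]         # the value the model is asked to predict
    return x_window, y_history, y_target


# Toy data: 6 time steps, 2 driving series per step.
X = [[float(i), float(i * 10)] for i in range(6)]
y = [100.0 + i for i in range(6)]
x_window, y_history, y_target = make_window(X, y, t_end=5, T=4)
```

In this toy example the target is the value at index 5, while both x_window and y_history end at index 4, so the row that contains the target is never part of the input.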
Maybe you can review this code and see if my explanation makes sense to you. If not, or if you have any more questions, feel free to write back.
-Bill
Oh, I get your point. I see that in your code y_history covers 1 to T-1 and y_target is T.
I think my problem is solved. Thank you!
But I read the paper again too, and I find that the problem may come from the original paper itself. In the original paper's NARX model function, X_T and y_tilde_T definitely appear at the same time step (no matter what the code does).
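For reference, the NARX mapping in the DA-RNN paper's problem statement is, as far as I recall it, the following (here \hat{y}_T denotes the predicted target value; treat the exact notation as an assumption and check it against the paper):

```latex
\hat{y}_T = F\left(y_1, \ldots, y_{T-1}, \mathbf{x}_1, \ldots, \mathbf{x}_T\right)
```

The target history stops at T-1, while the driving series run through T, which is the overlap being discussed here.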
So I read about the NARX model again and asked my supervisor. He told me the model is commonly used in control engineering, and that y_T and X_T do not conflict because y_T cannot be calculated from X_T. The relevant sentence about NARX is:
which relates to the fact that knowledge of other terms will not enable the current value of the time series to be predicted exactly.
I think that is really useful. Consider the difference between the features of the SML 2010 dataset (temperature forecasting) and those of the NASDAQ 100 Stock dataset. In my humble opinion, the original paper's model can properly be applied to temperature data, but it is not appropriate for stock data. Maybe it is not our fault.
In any case, thank you very much.
You are welcome.
On another note, I have implemented several LSTM-based neural networks that essentially perform regression on financial data. The neural networks always return results that suggest that financial time series behave as random walks/martingales.
When you zoom in on a graph of the predictions vs. actuals in the test sets of these networks, you will see that the change in the predictions always tracks the previous actual change.
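The lagging pattern described above is what a naive "persistence" forecast produces on a random walk. The sketch below is an illustration with simulated data, not output from any of the networks mentioned; the function persistence_forecast and the simulated series are hypothetical.

```python
import random


def persistence_forecast(series):
    """Naive forecast: predict each value as the previous actual value.

    Returns forecasts aligned with series[1:].
    """
    return series[:-1]


# Simulate a random walk with a fixed seed (purely synthetic data).
random.seed(0)
walk = [100.0]
for _ in range(500):
    walk.append(walk[-1] + random.gauss(0.0, 1.0))

actual = walk[1:]
predicted = persistence_forecast(walk)

# Step-to-step changes of forecasts and of actuals.
pred_change = [predicted[i] - predicted[i - 1] for i in range(1, len(predicted))]
actual_change = [actual[i] - actual[i - 1] for i in range(1, len(actual))]

# Each forecast change repeats the actual change from one step earlier.
echoes_previous_move = pred_change[1:] == actual_change[:-1]
```

If a trained model's predictions behave like this, it has learned little beyond "tomorrow looks like today", which is consistent with the random-walk interpretation.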
Any organization that has made money using these neural network technologies must also be using other, non-market data (Twitter feeds, etc.), as well as more granular time series (minute data, or even market depth bid/ask data).
Good luck with your studies, -Bill
Thank you for helping me. Your code is well worth learning from!
Hello. In your code, in the __init__() of class da_rnn, the code for getting the data is below. Why do you use companies' stock prices to predict the NASDAQ-100 Index? I mean especially ticker='NDX' in the function's arguments and
self.X = df_dat.loc[:, self.x_columns].as_matrix()
self.y = np.array(df_dat[ticker])
NDX should be calculated from these stock prices, shouldn't it? Why do you have to learn the calculation formula with an RNN? The DA-RNN paper gives a time series prediction model, right? But where is your time series prediction? I am confused.
That's what I found when I read the code repeatedly. If I got something wrong or missed something, please tell me. Thank you.