llSourcell / ethereum_future

This is the Code for "Ethereum Future Prices" by Siraj Raval on Youtube

What about the comment below your video? #5

Open lorrp1 opened 6 years ago

lorrp1 commented 6 years ago

"I love your videos Siraj, but this is very disappointing. Using a bidirectional network is beyond cheating. If you were to actually try to use this, the network requires past data AND future data as inputs to make a prediction, which won't exist yet. Also, even if you use a regular LSTM, this type of scheme doesn't work. The network learns to predict the input at time t plus some noise as the output. It creates nice-looking graphs but is useless in trading. I know this is a topic that will get you some views, but the least you can do is mention that it is cheating and won't work at all before you waste people's time."
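The claim that a model can "predict the input at time t plus some noise" and still produce a convincing graph can be checked against a persistence baseline. The sketch below is only an illustration under stated assumptions: the series is a synthetic random walk, not the repository's data, and the helper `mae` is a hypothetical name. It does not show what the network in this repository actually learns; it only shows that copying the previous value already scores well on this kind of series, so any model should be compared against it.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Synthetic random-walk "price" series (an assumption, not the repo's data).
prices = [100.0]
for _ in range(499):
    prices.append(prices[-1] + random.gauss(0.0, 1.0))

# Persistence baseline: predict that the next price equals the current one.
actual = prices[1:]
persistence = prices[:-1]

# Constant baseline: always predict the mean of the whole series.
mean_price = sum(prices) / len(prices)


def mae(pred, truth):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(p - t) for p, t in zip(pred, truth)) / len(truth)


mae_persistence = mae(persistence, actual)
mae_constant = mae([mean_price] * len(actual), actual)

print(f"persistence MAE:   {mae_persistence:.3f}")
print(f"constant-mean MAE: {mae_constant:.3f}")
```

If a trained model's test error is not clearly below the persistence error, its plotted predictions will track the real curve closely while carrying little information that was not already in the last observed price.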

simonhughes22 commented 6 years ago

You don't understand his code. The bidirectional LSTM goes back and forth over the time window, which is always in the PAST relative to what he is predicting. I have worked extensively on these types of models in my PhD and there is nothing wrong with his code AFAICT. Yes, a bidirectional RNN runs forwards and backwards over the data, but it depends on what data it's looking at. Ensuring that the data is in the past means this is neither cheating nor unrealistic. Why do this then? Well, the RNN can get more out of the data by passing over the time series in both directions. It's looking into the past to determine which data points to remember, as those are important for the prediction task. But when doing so it can only use the context of the ones it has already seen in the current pass. So it sees the world differently when iterating over the data in each direction. This is thus often much more effective for making predictions on time series than a single forward pass.

simonhughes22 commented 6 years ago

@LorenzoPanico let me explain the code that does the data preparation:

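    X_train = training_data[:, :-1]  # inputs: every step of each window except the last
    Y_train = training_data[:, -1]   # last step of each window (the current day), all features
    Y_train = Y_train[:, 20]         # labels: keep only the price feature at index 20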

The data is 3D: the first dimension is the day, the second is the time window up to that day, and the third is the features in the file, which include the Bitcoin price at index 20. In the first line he's taking all rows and the entire time series up to but excluding the last data point (which is for the current day) and assigning that as the inputs for the RNN to train on. Then in the second line, where he starts to compute the labels, he simply takes the last data point from the time window, i.e. the current day. That is now a 2D array, as the time series dimension has essentially been removed, and it consists of a matrix of days (rows) by features (cols) for the current day (again excluded from the training inputs by line 1). In the last line he drops all features except the price, as that is what it is predicting, and those become the labels. Hope that now makes sense. There is no data leakage in the labels.
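The slicing described above can be checked on a small synthetic array. This is a minimal sketch, assuming NumPy and made-up dimensions (100 samples, a 50-step window, 35 features); the real CSV and its actual shape are not reproduced here. Only the three slicing lines come from the code quoted in this thread.

```python
import numpy as np

# Synthetic stand-in for the prepared data (assumed shape, not the real CSV):
# 100 samples, a 50-step window per sample, 35 features per step.
n_samples, window_len, n_features = 100, 50, 35
training_data = np.arange(n_samples * window_len * n_features, dtype=float)
training_data = training_data.reshape(n_samples, window_len, n_features)

X_train = training_data[:, :-1]  # every step of each window except the last
Y_train = training_data[:, -1]   # the last step of each window, all features
Y_train = Y_train[:, 20]         # keep only the feature at index 20

print(X_train.shape)
print(Y_train.shape)

# Each label comes from the final step of its window, which X_train omits.
assert np.array_equal(Y_train, training_data[:, -1, 20])
```

Within one sample the label step never appears in the inputs. When windows overlap, a value used as the label for one sample can still show up inside the input window of a later sample, which is expected, because by then that value is in the past.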

chouligi commented 6 years ago

@simonhughes22 did you manage to find the CSV that he uses in this code? Do you think he merged different datasets to make it?

simonhughes22 commented 6 years ago

@chouligi I didn't spend time looking for it. From everyone's comments on the YT video, I think he merged it from several different data sources.

eshijia commented 6 years ago

As far as I know, many similar code examples only show people how to use an LSTM. The prediction results are deceptive. I haven't found any successful solutions for time-series prediction tasks with LSTMs.

chouligi commented 6 years ago

@eshijia why do you say that the prediction result is deceptive? Moreover, if LSTMs are not successful in time series predictions, what could be a better alternative then?

eshijia commented 6 years ago

@chouligi A basic tutorial is here.

chouligi commented 6 years ago

Thanks for this @eshijia! I read the article; however, can you indicate where in his code @llSourcell makes a single point-by-point prediction? I tested the code using an Ethereum dataset and I think it predicts the whole test set at once. I must say that the results were far worse than for Bitcoin due to lack of data, but I don't find the results deceptive.
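One possible reading of the distinction being discussed here is between predicting each test point from the true history and rolling a forecast forward on the model's own outputs. The sketch below illustrates only that distinction. `toy_model` is a hypothetical stand-in (the mean of the last three values), not the LSTM from this repository, and the series is made up.

```python
def toy_model(window):
    """Hypothetical stand-in for a trained network: mean of the last 3 values."""
    return sum(window[-3:]) / 3


series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
window_len = 3
history, test = series[:5], series[5:]

# Point-by-point: each prediction sees the true values up to the previous step.
known = list(history)
point_by_point = []
for true_value in test:
    point_by_point.append(toy_model(known[-window_len:]))
    known.append(true_value)  # the real observation becomes available

# Full-sequence: predictions are fed back in place of real observations.
rolled = list(history)
full_sequence = []
for _ in test:
    pred = toy_model(rolled[-window_len:])
    full_sequence.append(pred)
    rolled.append(pred)  # the model's own output replaces the truth

print(point_by_point)
print(full_sequence)
```

Calling a model once on a whole test array of windows built from real data corresponds to the first loop, since every window still contains true past values. The second loop is the harder test, because errors compound as soon as real observations stop arriving.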

dooley1001 commented 6 years ago

@simonhughes22 try running it and come back to me :)