wagenaartje / neataptic

:rocket: Blazing fast neuro-evolution & backpropagation for the browser and Node.js
https://wagenaartje.github.io/neataptic/

Define Output #108

Open talvasconcelos opened 6 years ago

talvasconcelos commented 6 years ago

Hi, I just found this and I'm very interested in testing it with stock OCHL data. What I want to achieve is to predict whether a stock will rise at least 5% in price in the next 8 hours, for example.

So I have the normalized dataset for testing for one stock: 5800 periods (5 min candles) with open, close, high, low, volume and base volume. This would be the input; how do I set the output for training purposes?

I want to feed the data into an LSTM for this prediction. Can anyone help? BTW, I'm a deep learning noob so, go easy!

Thanks, Tiago

wagenaartje commented 6 years ago

If you're using an LSTM, you always set the output of a training case to be the input of the next training case.

So if the value of my stock goes 5 -> 4 -> 2 -> 5, the training set for Neataptic would be (the values have to be normalized first):

trainingset = [
  { input: [5], output: [4] },
  { input: [4], output: [2] },
  { input: [2], output: [5] }
]
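
A minimal sketch of building such a set from a raw price series, assuming simple min-max scaling to [0, 1]. `toNextStepSet` is a hypothetical helper for illustration, not part of Neataptic:

```javascript
// Hypothetical helper: turns a price series into next-step training pairs.
// Values are min-max scaled to [0, 1]; assumes the series is not constant.
function toNextStepSet(series) {
  const min = Math.min(...series)
  const max = Math.max(...series)
  const scale = v => (v - min) / (max - min)

  const set = []
  for (let i = 0; i < series.length - 1; i++) {
    // the output of each case is the input of the next case
    set.push({ input: [scale(series[i])], output: [scale(series[i + 1])] })
  }
  return set
}

const trainingSet = toNextStepSet([5, 4, 2, 5])
console.log(JSON.stringify(trainingSet))
```

The resulting array has one entry fewer than the series, because the last value has no successor to use as its output.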

If you have any more questions, feel free to ask.

PS: I have been doing stock predictions on the data you're referencing using my lib stocks.js and this library. It could spot very small recurring patterns, but I was unable to predict successfully over longer periods of time. You really need more inputs; e.g. recent good or bad news would be very effective.

talvasconcelos commented 6 years ago

Hi man, thanks for the reply. I'll try this approach too, but for now I don't want to predict the next value or closing price; I want a prediction of whether the stock will move 5% up within an X amount of time. I'm trying to model my data for the input training data, I'll keep you posted. I don't need it to be an LSTM, I would just like it to work...

For now I have the input done with open, close, volume, RSI and EMA10 (for 12 periods, 5 min candles). I'm working on the output: if, in the next 48 candles, the closing price is 5% above the closing price in the input, return true, else false. Will this work?

talvasconcelos commented 6 years ago

Hi @wagenaartje, I'm a complete noob in machine learning... I want to "teach" a network to read charts. As I said above, I don't want it to predict prices directly. I want it to predict if a stock will make an X% move within Y candles. That being said, what is the best way to achieve that? I'm trying to feed the network with indicators and close price (since the values are normalized, the network learns changes in percentage and not value, right?) and setting the output to true or false based on whether the move did happen in the training data.

If I'm thinking right, what I need to do is feed the network with a sequence of, let's say, 6 candles and get a prediction of 6 candles into the future, then just check if the prediction has that X% move up.

How can I build that sequence-to-sequence LSTM network, or NARX, or whatever is best suited? Is this what the batch is for? Can I feed 6 candles (RSI, EMA, close price and volume) and expect 12 or 20 or 40 candles in a prediction?

Please if anyone could help out with this... all examples i see are in Python!!

wagenaartje commented 6 years ago

> since the values are normalized, the network learns changes in percentage and not value, right?

Internally the network 'knows' nothing. You can input the values in whatever form you want, the neural network will learn the relationships. Setting the output to true or false sounds good.

What exactly do you mean by candles? You should basically create a data set that represents exactly what you want the network to do. The network will learn the relationships in the data set and will be able to predict outputs for future data inputs. However, the inputs should always be of the same form.

You basically just provide a data set for time t where all the inputs are certain values related to the stock at time t and you provide the output as true when a stock rise of 5% happens in between t and t+8, otherwise false.
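
One way to produce such true/false outputs, as a rough sketch: for each time t, look ahead a fixed number of candles and check whether any close reaches the threshold. `labelRises`, its parameters, and the sample closes are hypothetical; 1 and 0 stand in for true and false, since the training outputs are numeric arrays.

```javascript
// Hypothetical helper: label[t] is 1 when any close in (t, t + horizon]
// is at least `threshold` above the close at t, otherwise 0.
// The last `horizon` candles have no complete look-ahead window and are dropped.
function labelRises(closes, horizon, threshold) {
  const labels = []
  for (let t = 0; t + horizon < closes.length; t++) {
    const target = closes[t] * (1 + threshold)
    const future = closes.slice(t + 1, t + 1 + horizon)
    labels.push(future.some(c => c >= target) ? 1 : 0)
  }
  return labels
}

const closes = [100, 101, 103, 106, 104, 102, 101, 100]
const labels = labelRises(closes, 3, 0.05)
console.log(labels) // [1, 0, 0, 0, 0]
```

Each label would then be paired with the inputs for the same t, e.g. `{ input: inputsAtT, output: [labels[t]] }`.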

Please post a snippet of how you are trying to feed the data so I can provide help!

talvasconcelos commented 6 years ago

Here's the inputData i'm getting into the network:

[
 {
  "open": 7138,
  "close": 7125,
  "ROC": 0.1122807022,
  "time": "2017-11-09T16:00:00",
  "RSI": 0,
  "EMA_10": 7124.3,
  "RVOL": 11.18203074,
  "output": 7102.999999
 },
 {
  "open": 7121.03501411,
  "close": 7102.999999,
  "ROC": 0.5209066733,
  "time": "2017-11-09T16:05:00",
  "RSI": 55.08,
  "EMA_10": 7122.063636,
  "RVOL": 19.67815571,
  "output": 7126
 },
 {
  "open": 7102.999999,
  "close": 7126,
  "ROC": 0.1543642997,
  "time": "2017-11-09T16:10:00",
  "RSI": 59.85,
  "EMA_10": 7121.506612,
  "RVOL": 20.94756647,
  "output": 7137
 },
 {
  "open": 7121,
  "close": 7137,
  "ROC": 0.1821493625,
  "time": "2017-11-09T16:15:00",
  "RSI": 59.88,
  "EMA_10": 7123.596319,
  "RVOL": 9.9236569,
  "output": 7139.99999998
 },
 {...
}]

This is getting normalized with a package, Normalizer, and the actual data that is fed to Neataptic looks like this:

[
 {
  "input": [
   0.30188,
   0.300535,
   0.440454,
   0,
   0.280622,
   0.017709
  ],
  "output": [
   0.296466
  ]
 },
 {
  "input": [
   0.298753,
   0.296466,
   0.460759,
   0.63691,
   0.280214,
   0.031693
  ],
  "output": [
   0.30072
  ]
 },
 {
  "input": [
   0.295429,
   0.30072,
   0.442545,
   0.692068,
   0.280112,
   0.033783
  ],
  "output": [
   0.302755
  ]
 }
]

At the moment I'm trying just price prediction; then I'd try to predict the next X prices and check if there is a 5% rise. This is the output I'm getting using LSTM(6, 7, 1), with 1000 iterations:

[
 {
  "input": 9540.14156506,
  "expected": 9547.997965120001,
  "output": 9792.924037759249
 },
 {
  "input": 9547.997965120001,
  "expected": 9547.997965120001,
  "output": 9854.019147896204
 },
 {
  "input": 9547.997965120001,
  "expected": 9520.00041556,
  "output": 9810.051919794365
 },
 {
  "input": 9520.00041556,
  "expected": 9479.772186760001,
  "output": 9744.689786246094
 },
 {
  "input": 9479.772186760001,
  "expected": 9499.99984858,
  "output": 9698.919520395779
 },
 {
  "input": 9499.99984858,
  "expected": 9501.00014728,
  "output": 9675.240217598222
 },
 {
  "input": 9501.00014728,
  "expected": 9499.99984858,
  "output": 9646.688965513822
 },
 {
  "input": 9499.99984858,
  "expected": 9481.999879,
  "output": 9628.623816886993
 },
 {
  "input": 9481.999879,
  "expected": 9450.00113464,
  "output": 9603.011417013062
 },
 {
  "input": 9450.00113464,
  "expected": 9450.20119438,
  "output": 9581.471431529419
 },
 {
  "input": 9450.20119438,
  "expected": 9430.00056766,
  "output": 9568.824566946361
 },
 {
  "input": 9430.00056766,
  "expected": 9400.662077140001,
  "output": 9558.361519496448
 },
 {
  "input": 9400.662077140001,
  "expected": 9450.00113464,
  "output": 9543.17865351095
 },
 {
  "input": 9450.00113464,
  "expected": 9449.00083594,
  "output": 9560.930809793392
 },
{...}]

Initially, what I was trying to do was to feed it the same input as above but for t, t+1, t+2, ..., t+6 to predict if this can produce a 5% rise in price. I can't seem to figure out how to feed the network like that. How many inputs? Is it done like:

{input: [price(t-6), ind1(t-6), ind2(t-6)], [...], [price(t), ind1(t), ind2(t)], output: true/false}

It's very tricky as i don't want the network to predict "price" for a specific stock, i want it to learn to "read" the indicators and predict a move on any stock i run activate on... no matter if the price is $10000 or $0.1...
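
A possible way to shape such inputs, sketched under assumptions: flatten the last few candles into a single array, and express each close relative to the most recent close so the values do not depend on the absolute price. `windowInputs` and the `close`/`rsi` field names are hypothetical; the network would then need `size × features` inputs.

```javascript
// Hypothetical helper: builds one flat input vector per window of `size` candles.
// Closes are expressed relative to the last close in the window, so the
// vector does not depend on the absolute price level. RSI (0..100) is scaled to 0..1.
function windowInputs(candles, size) {
  const inputs = []
  for (let end = size; end <= candles.length; end++) {
    const win = candles.slice(end - size, end)
    const ref = win[win.length - 1].close
    // [relClose(oldest), rsi(oldest), ..., relClose(newest), rsi(newest)]
    inputs.push(win.flatMap(c => [c.close / ref - 1, c.rsi / 100]))
  }
  return inputs
}

const candles = [
  { close: 10, rsi: 50 },
  { close: 11, rsi: 60 },
  { close: 12, rsi: 70 },
]
const cheap = windowInputs(candles, 2)
// The same chart shape at a 1000x higher price level
const pricey = windowInputs(candles.map(c => ({ ...c, close: c.close * 1000 })), 2)
console.log(cheap.length, cheap[0].length) // 2 windows, 4 values each
```

In this sketch both price levels produce the same vectors, which is the property needed for a network meant to run on any stock.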

talvasconcelos commented 6 years ago

I've been playing around with Neataptic; these are some results I'm getting from the test data...

[Images: prediction charts for LSTM, NARX and GRU]

What seems weird is that the values in the beginning are way off (except on GRU, where it's just the first couple) but then tend to be quite accurate. What am I doing wrong?

I'm using some iterations from this:

const nea = require('neataptic')

// GRU with 5 inputs, one hidden block of 6 units, and 1 output
const network = new nea.architect.GRU(5, 6, 1)

network.train(trainData, {
  log: 100,        // log training status every 100 iterations
  rate: 0.1,
  iterations: 1000,
  clear: true,     // clear the network state after every activation
})

// testData is the last 50 samples of the original data; they were removed from trainData
const predictions = testData.map(cur => network.activate(cur.input))

thegamecat commented 6 years ago

Is that predicting?

talvasconcelos commented 6 years ago

Blue is what was expected from the activate(), orange is what the AI predicted.

thegamecat commented 6 years ago

predicted when in the series though?

talvasconcelos commented 6 years ago

Sorry i don't get your question, i'm an AI noob!

thegamecat commented 6 years ago

Ah ok sorry.

What I mean is if you are at say tick 10, is your prediction tick 11 or further into the future?

talvasconcelos commented 6 years ago

Yes, it's predicting 1 step ahead in the series. Maybe I should have tried feeding it its own output.

testData.map((cur, i) => {
  return network.activate(cur.input)
})

where testData is the last 50 inputs from the dataset, not used in training.
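
Feeding the network its own output could look roughly like the sketch below. `forecast` is a hypothetical helper and `fakePredict` is a stand-in so the example runs without a trained network; with Neataptic, `predict` would be something like `x => network.activate(x)`. This only works directly when the output has the same shape as the input (e.g. one price in, one price out); with several indicators as inputs, those would have to be predicted or recomputed at each step as well.

```javascript
// Hypothetical helper: rolls a one-step predictor forward `steps` times,
// feeding each prediction back in as the next input.
// `predict` takes an input array and returns an output array of the same length.
function forecast(predict, seed, steps) {
  const out = []
  let input = seed
  for (let i = 0; i < steps; i++) {
    const next = predict(input)
    out.push(next[0])
    input = next
  }
  return out
}

// Stand-in predictor (halves its input) so the sketch runs on its own
const fakePredict = ([x]) => [x * 0.5]
const future = forecast(fakePredict, [0.8], 3)
console.log(future) // [0.4, 0.2, 0.1]
```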