Not an issue: How to adapt for multiple datasets?

windowshopr commented 5 years ago

I'm loving working with your project.

I was wondering if you might be able to help me with something. I want to adapt your model to run over multiple datasets. I have lots of different .csv files containing 5-minute historical data for many different stocks, but each file is small: each one contains only one trading day's worth of data. I'm basically trying to apply your reinforcement learning model to day trading.

So, how might one modify your model to train on multiple datasets at once, when each dataset is fairly small and holds a stock that was day traded in the past? Could it loop through every file in the train and test folders? Or would it be better to compile everything into one dataset and have the model recognize the ticker symbol, so that it doesn't try to learn a single pricing pattern when the prices for the different tickers are all over the place? Would love your feedback! Thanks!

windowshopr commented 5 years ago

I should add, of course, that I've already tried running the model on just one dataset: one stock ticker with one day's worth of 5-minute data. At first I get the following error:

File "C:\Users\...\env\TFTraderEnv.py", line 171, in reset
    self.current_tick = random.randint(0, self.df.shape[0] - 800)
  File "C:\Users\...\AppData\Local\Programs\Python\Python36\lib\random.py", line 221, in randint
    return self.randrange(a, b+1)
  File "C:\Users\...\AppData\Local\Programs\Python\Python36\lib\random.py", line 199, in randrange
    raise ValueError("empty range for randrange() (%d,%d, %d)" % (istart, istop, width))
ValueError: empty range for randrange() (0,-770, -770)

I'm assuming this is because the dataset is just too small, so I changed this line in TFTraderEnv.py:

self.current_tick = random.randint(0, self.df.shape[0] - 800)

...to

self.current_tick = random.randint(0, self.df.shape[0] - 1)

...and then receive the following error:

File "C:\Users\windowshopr\Desktop\Python Scripts\Stock Market Prediction\Warrior Trading Reinforcement Learning Attempt\env\TFTraderEnv.py", line 208, in updateState
    self.closingPrice = float(self.closingPrices[self.current_tick])
IndexError: index 29 is out of bounds for axis 0 with size 29

Is there something else I need to modify to make this work with a smaller dataset? Or, again, how could it be modified to train on several datasets, or on one big dataset with multiple stocks in it? Thanks!!!

miroblog commented 5 years ago

How to train on multiple stock tickers?
This repository seems most relevant: https://github.com/wassname/rl-portfolio-management. It basically tries to allocate a portfolio into different buckets (cash, stock1, stock2, stock3, ...).
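
For the other option raised in the question, looping over many one-day files, a rough sketch might look like the following. It is not code from this repository or from the linked one. The function names, the "close" column name, and the idea of scaling each day by its first close (so tickers at different price levels look comparable) are all assumptions for illustration, and the environment would still need to be changed to accept the sampled rows.

```python
import csv
import random
from pathlib import Path

WINDOW_SIZE = 30  # assumed to match the window_size (30) mentioned in this thread


def load_day_files(folder, min_rows=WINDOW_SIZE + 1):
    """Read every CSV in `folder` into a list of row dicts.

    Files with fewer than `min_rows` rows are skipped, since the data
    has to be longer than the observation window.
    """
    days = []
    for path in sorted(Path(folder).glob("*.csv")):
        with open(path, newline="") as handle:
            rows = list(csv.DictReader(handle))
        if len(rows) >= min_rows:
            days.append(rows)
    return days


def normalized_closes(rows, close_column="close"):
    """Scale closes by the first close of the day (hypothetical column name)."""
    closes = [float(row[close_column]) for row in rows]
    first = closes[0]
    return [price / first for price in closes]


def sample_day(days, rng=random):
    """Pick one day's rows at random, e.g. at the start of each episode."""
    return rng.choice(days)
```

One way to use this would be to call sample_day(...) at the start of each episode and build the price series for that episode from the result, so every episode sees a single ticker on a single day.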
For the small-dataset error, change the line to self.current_tick = 0. Note that self.df.shape[0] should be bigger than window_size (30).
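
A guarded version of that start-tick choice could be sketched as below. This is only an illustration, not the repository's code: pick_start_tick and its parameters are hypothetical names, and the margin of 800 is taken from the original randint line. Note that the IndexError above reports an array of size 29, which is below the window_size of 30, so that particular file would be rejected by the check here.

```python
import random


def pick_start_tick(n_rows, window_size=30, margin=800, rng=random):
    """Return a starting row index for an episode.

    Falls back to 0 when the dataset is shorter than `margin`, and
    refuses datasets that are not longer than the observation window.
    """
    if n_rows <= window_size:
        raise ValueError(
            f"dataset has {n_rows} rows; need more than window_size={window_size}"
        )
    last_start = n_rows - margin
    if last_start <= 0:
        # Short dataset: always start from the first row.
        return 0
    # random.randint includes both endpoints.
    return rng.randint(0, last_start)
```

In the environment's reset, this would stand in for the randint line, called with self.df.shape[0] as n_rows.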

windowshopr commented 5 years ago

Thank you again for the help! I will look at that other repository for inspiration as well. Great application!

miroblog / tf_deep_rl_trader
