llSourcell / Reinforcement_Learning_for_Stock_Prediction

This is the code for "Reinforcement Learning for Stock Prediction" by Siraj Raval on YouTube

Resolved issue around inability to evaluate and overflow in sigmoid. Also added a few lines that I missed in merge last night. #8

Open xtr33me opened 6 years ago

xtr33me commented 6 years ago

I was always getting a profit of 0 when evaluating the model. This was primarily because a "Buy" never occurred, so agent.inventory stayed empty. I changed it so that a buy is forced on the first iteration, and the model picks up from there. In a future adjustment we could infer the best time to buy from the sliding window, or by some other means. For now, this at least allows evaluation to run on other datasets.
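A minimal sketch of the idea, not the exact diff: `act` and `prices` are stand-ins for the repo's `agent.act` and `getStockDataVec`, and the action codes (0 = sit, 1 = buy, 2 = sell) follow the original codebase.

```python
def evaluate(act, prices):
    """One evaluation pass that forces a buy on the first step.

    act: policy stand-in mapping a time index to an action code
    prices: list of closing prices
    """
    ACTION_SIT, ACTION_BUY, ACTION_SELL = 0, 1, 2  # repo's action codes
    inventory, total_profit = [], 0.0
    for t, price in enumerate(prices):
        # Force the first action to be a buy so inventory is never
        # empty for the whole run; afterwards the policy takes over.
        action = ACTION_BUY if t == 0 else act(t)
        if action == ACTION_BUY:
            inventory.append(price)
        elif action == ACTION_SELL and inventory:
            total_profit += price - inventory.pop(0)  # FIFO close-out
    return total_profit

# Toy usage: sell every fifth step, sit otherwise.
print(evaluate(lambda t: 2 if t % 5 == 0 else 0, [10, 11, 12, 13, 14, 15]))
```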

The sigmoid was also overflowing whenever the magnitude of the input x exceeded what math.exp can handle, which on my system was around 700. The replacement implementation is one I found on Stack Overflow.
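For reference, the standard overflow-safe form (one common Stack Overflow variant; the merged code may differ in details) branches on the sign of x so that math.exp is only ever called with a non-positive argument:

```python
import math

def sigmoid(x):
    # math.exp overflows once its argument exceeds roughly 709 on
    # IEEE-754 doubles, so never exponentiate a positive value.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)          # safe: x < 0, so z is in (0, 1)
    return z / (1.0 + z)
```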

alanyuwenche commented 5 years ago

Thanks for sharing.

This modification works around the zero profit caused by a "Buy" never occurring, but I don't think it solves the core problem: why can't a trained agent take the proper (buy) action even on the data it was trained on? I ran into this while building a sell agent (code attached). In the original code the agent must hold a "Buy" position before it can take a "Sell" action; likewise, the code can easily be modified so the agent must open a "Sell" position first. But despite many attempts, my agents only take the "Buy" action, even when I force a sell on the first step. No matter how well they perform during training, that performance does not seem to transfer to evaluation.
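For readers following along, one way the sell-first bookkeeping could look is sketched below; this is an illustration of the idea, and the attached agent_sell.zip may implement it differently. A sell opens a short position and a later buy closes it:

```python
def step_short(action, price, shorts, total_profit):
    """Mirror of the repo's buy-then-sell inventory logic, reversed."""
    ACTION_BUY, ACTION_SELL = 1, 2  # action codes assumed from the repo
    if action == ACTION_SELL:
        shorts.append(price)                    # open a short position
    elif action == ACTION_BUY and shorts:
        total_profit += shorts.pop(0) - price   # profit if price fell
    return total_profit
```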

If we can make this work, the example would actually show how to deal with the "Environment", which is usually quite difficult to model in financial markets. agent_sell.zip