kkuette / TradzQAI

Trading environment for RL agents, backtesting and training.
Apache License 2.0
165 stars, 47 forks

PPO implementation #7

Closed: talvasconcelos closed 5 years ago

talvasconcelos commented 5 years ago

Hi man, could you please help me out with PPO? I "ported" q-trader to TensorFlow.js. It's painfully slow, and I'd like to try implementing a different policy for it. Could you please walk me through a simple PPO implementation for trading purposes (3 actions, 1 stock, etc.)?

I'm not a Python coder, and my Python reading skills are not that great, so I can't seem to understand your code well enough to implement it in tfjs.

How does the policy get updated, what parameters go into it, how much faster is it than the Q-function approach, etc.?

Thanks, Tiago

kkuette commented 5 years ago

Hi talvasconcelos, in fact I haven't implemented any agent myself; I've barely made a wrapper. I use a library called tensorforce, which is built on TensorFlow. In terms of speed, it runs at around 600 it/s, so it's much faster than q-trader. As far as I can tell, they use a TensorFlow monitored session.
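On the question of how the policy gets updated: the core of PPO is the clipped surrogate objective, which is small enough to port by hand. Below is a rough NumPy sketch of that loss for the 3-action case. It is not tensorforce's code and not what this repo runs; the function names, the toy batch, and `clip_eps=0.2` are assumptions made for illustration. A full agent would also need an advantage estimator, a value network, and a gradient step on this loss.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def ppo_clip_loss(new_logits, old_probs, actions, advantages, clip_eps=0.2):
    """Clipped surrogate loss (to minimize) for a batch of discrete actions.

    new_logits: (batch, n_actions) outputs of the policy being trained
    old_probs:  (batch, n_actions) action probabilities recorded at rollout time
    actions:    (batch,) indices of the actions actually taken
    advantages: (batch,) advantage estimates for those actions
    """
    idx = np.arange(len(actions))
    new_p = softmax(new_logits)[idx, actions]  # pi_new(a|s)
    old_p = old_probs[idx, actions]            # pi_old(a|s), held fixed
    ratio = new_p / old_p
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps)
    # Take the pessimistic (smaller) objective, then negate to get a loss.
    return -np.mean(np.minimum(ratio * advantages, clipped * advantages))

# Toy batch: 4 steps, 3 actions (hypothetical ordering: 0=hold, 1=buy, 2=sell).
rng = np.random.default_rng(0)
old_logits = rng.normal(size=(4, 3))
old_probs = softmax(old_logits)
actions = np.array([0, 2, 1, 1])
advantages = np.array([1.0, -0.5, 0.3, 2.0])

# Before any update the new policy equals the old one, so every ratio is 1.
loss_same = ppo_clip_loss(old_logits, old_probs, actions, advantages)

# A big jump toward a good action is capped at (1 + clip_eps) * advantage.
uniform = np.full((1, 3), 1.0 / 3.0)
loss_capped = ppo_clip_loss(np.array([[10.0, 0.0, 0.0]]), uniform,
                            np.array([0]), np.array([1.0]))
```

The clipping is what keeps each update close to the policy that collected the data: once the probability ratio leaves the range [1 - clip_eps, 1 + clip_eps] in the direction the advantage favors, the objective stops rewarding further movement.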

I hope this helps.