Hi Husein,
Thanks for sharing this great work. I am new to AI, but I understand ES in general.
While reading your code, I had a few questions. I would be grateful if you could shed some light on them:
1. In class Model, the output is the decision/action, a list of three values. Which actions do they correspond to in your design? Hold == 0, Buy == 1, Sell == 2?
2. In Agent.act(), np.argmax(decision[0]) is returned. Does decision[0] hold the three likelihoods of hold/buy/sell?
3. weights[2] (or buy) controls the trade quantity as the second output layer. Does that mean the agent suggests the trade size based on the price delta between two price windows? Is the value range predetermined (e.g. 1-1000), or is it a black box?
Best Regards, Steve