I trained market_pg.py for 1000 episodes, but when I evaluate it on new data (2015-08 to 2016-09), the cumulative reward is nearly zero no matter which stock I choose.
So I want to ask: how do you calculate the revenue per episode? Does the cumulative reward mean the same thing?