Going through your code for TradingEnv, something seems odd to me. The function step(action) is supposed to return the next observation, right?
In step(action), if prices[self._current_tick] is used to calculate the reward for the action, then it should return an observation that contains the data point at self._current_tick, right? In that case, shouldn't get_observation be
instead of
?
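To make the question concrete, here is a toy sketch of the off-by-one I have in mind. This is not the actual TradingEnv code: `prices`, `window_size`, `current_tick`, and both helper functions are hypothetical names I made up for illustration, and I am assuming the observation is a fixed-size window sliced from the price series.

```python
# Toy illustration only: the names and window logic below are assumptions,
# not the actual TradingEnv implementation.
prices = [10, 11, 12, 13, 14, 15]
window_size = 3
current_tick = 4  # suppose step() has just used prices[4] to compute the reward


def window_excluding_tick(tick):
    # A slice end is exclusive, so this window stops at prices[tick - 1].
    return prices[tick - window_size:tick]


def window_including_tick(tick):
    # Shifting both bounds by one makes the last element prices[tick].
    return prices[tick - window_size + 1:tick + 1]


print(window_excluding_tick(current_tick))  # [11, 12, 13]
print(window_including_tick(current_tick))  # [12, 13, 14]
```

With the first form, the returned window never contains the price that was used for the reward; with the second, its last element is that price.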
Thank you for your help