notadamking / Stock-Trading-Visualization

A simple, yet elegant visualization of our stock trading RL agent environment.
MIT License

Model receiving most of the data points as 0. Affects _next_observation method. #3

Open renatodvc opened 5 years ago

renatodvc commented 5 years ago

I've opened an issue in Stock-Trading-Environment (link) that I believe also affects this repo (same class and, more specifically, same method).

However, I'm opening this new issue because I think the changes to the StockTradingEnv class (present only in this repo) created a new problem.

    def _next_observation(self):
        frame = np.zeros((5, LOOKBACK_WINDOW_SIZE + 1))

        # Get the stock data points for the last 5 days and scale to between 0-1
        np.put(frame, [0, 4], [
            self.df.loc[self.current_step: self.current_step +
                        LOOKBACK_WINDOW_SIZE, 'Open'].values / MAX_SHARE_PRICE,
            self.df.loc[self.current_step: self.current_step +
                        LOOKBACK_WINDOW_SIZE, 'High'].values / MAX_SHARE_PRICE,
            self.df.loc[self.current_step: self.current_step +
                        LOOKBACK_WINDOW_SIZE, 'Low'].values / MAX_SHARE_PRICE,
            self.df.loc[self.current_step: self.current_step +
                        LOOKBACK_WINDOW_SIZE, 'Close'].values / MAX_SHARE_PRICE,
            self.df.loc[self.current_step: self.current_step +
                        LOOKBACK_WINDOW_SIZE, 'Volume'].values / MAX_NUM_SHARES,
        ])

I'm not sure why np.zeros and np.put are used instead of building the frame directly with np.array, as in the Stock-Trading-Environment repo, but the way they are used here doesn't seem to achieve their purpose.
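For comparison, here is a minimal sketch of building the frame directly with np.array. It is not the code from either repo: build_frame is a hypothetical helper, the DataFrame is synthetic, and the values of MAX_SHARE_PRICE and MAX_NUM_SHARES are assumptions (only LOOKBACK_WINDOW_SIZE = 40 comes from this issue).

```python
import numpy as np
import pandas as pd

LOOKBACK_WINDOW_SIZE = 40      # value discussed in this issue
MAX_SHARE_PRICE = 5000         # assumed value, for illustration only
MAX_NUM_SHARES = 2147483647    # assumed value, for illustration only


def build_frame(df, current_step):
    """Hypothetical helper: stack the scaled OHLCV columns into one 2D array."""
    # .loc slicing is label-based and includes both endpoints, so with a
    # default integer index this window has LOOKBACK_WINDOW_SIZE + 1 rows.
    window = df.loc[current_step: current_step + LOOKBACK_WINDOW_SIZE]
    return np.array([
        window['Open'].values / MAX_SHARE_PRICE,
        window['High'].values / MAX_SHARE_PRICE,
        window['Low'].values / MAX_SHARE_PRICE,
        window['Close'].values / MAX_SHARE_PRICE,
        window['Volume'].values / MAX_NUM_SHARES,
    ])


# Synthetic price data standing in for the real CSV.
rng = np.random.default_rng(0)
n = 100
df = pd.DataFrame({
    'Open': rng.uniform(10, 100, n),
    'High': rng.uniform(10, 100, n),
    'Low': rng.uniform(10, 100, n),
    'Close': rng.uniform(10, 100, n),
    'Volume': rng.integers(1000, 100000, n),
})

frame = build_frame(df, current_step=0)
print(frame.shape)               # one row per OHLCV column
print(np.count_nonzero(frame))   # every entry is populated
```

Built this way, each of the five rows holds a full window of values, so nothing is left at its zero default.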

From what I've gathered so far, and I might be wrong here, the purpose of the frame is to aggregate past candle data (OHLCV) for up to 41 days, assuming LOOKBACK_WINDOW_SIZE is kept at 40. Setting aside the issue I mentioned at the top of the post, the use of np.put here causes the frame to receive only two data points, while the rest stay at zero.

Return of the frame after the np.put:

[[0.002726 0.       0.       0.       0.0033   0.       0.       0.        0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.      ]
 [0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.      ]
 [0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.      ]
 [0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.      ]
 [0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.       0.      ]]

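As you can see, the only values that changed are the ones at the indices passed as the second argument: np.put(frame, [0, 4], [...]). np.put treats those indices as positions in the flattened array, so only flat positions 0 and 4 (both in the first row) are written. All the rest is kept at zero.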
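The behaviour can be reproduced on a small array. This is a sketch with made-up numbers: rows stands in for the five OHLCV rows, and the slice assignment at the end is just one possible fix, not necessarily what the author intended.

```python
import numpy as np

# Stand-in for the five OHLCV rows (made-up values, 6 columns instead of 41).
rows = np.arange(1, 31, dtype=float).reshape(5, 6)

# Same call pattern as in _next_observation: the indices [0, 4] refer to
# positions in the flattened frame, not to rows 0 through 4.
frame = np.zeros((5, 6))
np.put(frame, [0, 4], rows)

print(frame[0])                  # only columns 0 and 4 of the first row are set
print(np.count_nonzero(frame))   # two values written in total

# One possible fix: assign the rows with slicing instead of np.put.
fixed = np.zeros((5, 6))
fixed[0:5, :] = rows
print(np.array_equal(fixed, rows))
```

Because np.put writes one value per index, only the first two values of the flattened input are used (here the first two entries of the first row), which matches the two non-zero entries in the output above.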