Creating a custom environment for reinforcement learning (RL) using VectorBT

I am trying to build a custom environment for training an RL trading agent. VectorBT provides a lot of functionality for building portfolios, so I want to build on top of it.

In RL, the agent chooses the action it thinks will maximize return going forward. The environment therefore simulates one action at a time and continues until we reach the end date. I can use vbt.Portfolio.from_orders to build a portfolio, but I would have to call it at every time step, which leaves me with many separate portfolios over a single episode.

Is there a way to mitigate this design problem using VectorBT, such as an online or accumulating portfolio? The RL trading agent has to observe the next portfolio value before making its next decision, so the portfolio needs to update incrementally.

I am happy to answer any further questions. Thanks in advance.
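To make the "updating portfolio" idea concrete, here is a minimal sketch of the behaviour being asked for. It uses NumPy only and is not VectorBT's API: the class `OnlinePortfolioEnv`, its methods, and the parameters `init_cash` and `fees` are hypothetical names chosen for illustration. It assumes a single asset, orders filled at the current close, and no short selling or leverage.

```python
import numpy as np


class OnlinePortfolioEnv:
    """Hypothetical single-asset environment that keeps one running portfolio."""

    def __init__(self, close, init_cash=100.0, fees=0.0):
        self.close = np.asarray(close, dtype=float)
        self.init_cash = float(init_cash)
        self.fees = float(fees)  # proportional fee per trade (assumption)
        self.reset()

    def reset(self):
        self.t = 0
        self.cash = self.init_cash
        self.position = 0.0
        # Keep every order so the episode can be replayed afterwards.
        self.orders = np.zeros(len(self.close))
        return self._observation()

    def value(self):
        # Mark the portfolio to market at the current close.
        return self.cash + self.position * self.close[self.t]

    def _observation(self):
        return np.array(
            [self.close[self.t], self.cash, self.position, self.value()]
        )

    def step(self, size):
        """Buy (size > 0) or sell (size < 0) units at the current close."""
        if self.t >= len(self.close) - 1:
            raise RuntimeError("episode is finished; call reset()")
        price = self.close[self.t]
        value_before = self.value()
        # No shorting and no leverage: clip the order to what is affordable.
        max_buy = self.cash / (price * (1.0 + self.fees))
        size = float(np.clip(size, -self.position, max_buy))
        self.cash -= size * price + abs(size) * price * self.fees
        self.position += size
        self.orders[self.t] = size
        # Advance one bar; the reward is the change in portfolio value.
        self.t += 1
        reward = self.value() - value_before
        done = self.t == len(self.close) - 1
        return self._observation(), reward, done


env = OnlinePortfolioEnv([10.0, 11.0, 12.0], init_cash=100.0)
obs, reward, done = env.step(5.0)  # buy 5 units at 10
obs, reward, done = env.step(0.0)  # hold
```

The state (cash, position, value) is carried from step to step, so the agent sees one continuous portfolio rather than a new one per action. Because the orders are stored in `env.orders`, they could presumably be passed once to vbt.Portfolio.from_orders at the end of an episode to get the full set of statistics, though that workflow is an assumption rather than something confirmed here.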

Creating a custom Environment for Reinforcement learning (RL) using VectorBT #469