ray-project / ray

Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
https://ray.io
Apache License 2.0

[rllib] How to load pre-trained PPO model to evaluate on slightly different environment #11156

Closed alperyyildiz closed 1 year ago

alperyyildiz commented 3 years ago

Hello,

I have a trading environment and a PPO agent with a custom tf/torch model (I can use either). I use a trainable object whose step method first trains on the training set and then validates the trained model on the validation set.

To validate the trained model, I need to change the timesteps and the dataset used. I tried to achieve this by:

- restoring the saved model and changing the environment parameters so the environment would switch its dataset and timesteps, but I ended up with the same training environment;
- writing to a text file that changes between train/validate in the trainable's step method, to send the necessary info to the environment to update the dataset and timesteps, but again I ended up with the same environment.

Is there any way to achieve this? I read that the export_model method has some trouble maintaining learnt behaviour. I've been stuck here for 3 days, so any help would be great.
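For context, here is a minimal self-contained sketch (no RLlib imports; TradingEnv and all field names are illustrative placeholders, not my actual code) of the pattern I understand RLlib to use: the trainer passes config["env_config"] to the environment constructor, so a validation environment could in principle be built with a different dataset and timestep budget:

```python
# Minimal sketch of the env_config pattern: the environment reads its
# dataset and episode length from a config dict, so a second instance
# with a validation config behaves differently without touching the model.
# TradingEnv and the key names ("dataset", "max_timesteps") are made up.

class TradingEnv:
    def __init__(self, env_config):
        self.dataset = env_config["dataset"]          # e.g. training vs. validation series
        self.max_timesteps = env_config["max_timesteps"]
        self.t = 0

    def reset(self):
        self.t = 0
        return self.dataset[self.t]

    def step(self, action):
        self.t += 1
        done = self.t >= self.max_timesteps           # episode length from config
        obs = self.dataset[min(self.t, len(self.dataset) - 1)]
        reward = 0.0                                  # placeholder reward
        return obs, reward, done, {}

# Same env class, two configs: training vs. validation.
train_env = TradingEnv({"dataset": [1, 2, 3, 4], "max_timesteps": 3})
val_env = TradingEnv({"dataset": [10, 20], "max_timesteps": 1})
```

The part I can't get working is making the restored trainer pick up the new env_config instead of the one it was trained with.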

stale[bot] commented 3 years ago

Hi, I'm a bot from the Ray team :)

To help human contributors focus on more relevant issues, I will automatically add the stale label to issues that have had no activity for more than 4 months.

If there is no further activity in the next 14 days, the issue will be closed!

You can always ask for help on our discussion forum or Ray's public slack channel.