[Question] - Why different results each time loaded model is run?

snafu4 commented 2 years ago

Any ideas why the code below (basically the same as the code in the README.md file) returns different results each time it is run after the model has been trained? The trained model is no longer being updated, and the input (trades) does not change between runs.

```python
model = A2C.load(modelName, env)

observation = env.reset()
while True:
    action, _states = model.predict(observation)
    observation, reward, done, info = env.step(action)

    state = env.render()
    z = calcsomevalue(state)  # placeholder for my own post-processing
    if done:
        x = getsomevalue(z)   # placeholder helper
        y = calcsomevalue(z)  # placeholder helper
        print(x, y)           # placeholder for the printed results
        break
```

snafu4 commented 2 years ago

Clarification of my question:


```python
import gym
from gym_mtsim import (
    Timeframe, SymbolInfo,
    MtSimulator, OrderType, Order, SymbolNotFound, OrderNotFound,
    MtEnv,
    FOREX_DATA_PATH, STOCKS_DATA_PATH, CRYPTO_DATA_PATH, MIXED_DATA_PATH,
)
from stable_baselines3 import A2C

env = gym.make('forex-hedge-v0')

model = A2C('MultiInputPolicy', env, verbose=0)
model.learn(total_timesteps=1000)

observation = env.reset()
while True:
    action, _states = model.predict(observation)
    observation, reward, done, info = env.step(action)
    if done:
        break

modelName = 'model_test'
model.save(f'{modelName}.zip')

env.render('advanced_figure', time_format="%Y-%m-%d")

###
# running the loop below should not result in different results each time, should it...
# ... but it does!  What am I missing here?
###
model = A2C.load(modelName)

for i in range(5):
    observation = env.reset()
    while True:
        action, _states = model.predict(observation)
        observation, reward, done, info = env.step(action)
        if done:
            break

    env.render('advanced_figure', time_format="%Y-%m-%d")
```

snafu4 commented 2 years ago

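`model.predict(observation, deterministic=True)` appears to fix the problem. By default, `predict` uses `deterministic=False` and samples each action from the policy's action distribution, so repeated runs can differ even though the model and the input are unchanged.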
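To illustrate why the flag matters, here is a toy sketch using only the standard library. `ToyPolicy` is a hypothetical stand-in, not Stable-Baselines3 code, and it uses a discrete action set for simplicity: a stochastic policy holds a probability distribution over actions, and the deterministic option picks the most likely action instead of sampling.

```python
import random


class ToyPolicy:
    """Toy stand-in for a trained stochastic policy (hypothetical, not SB3 code)."""

    def __init__(self, action_probs):
        # Fixed "learned" probabilities over discrete actions 0, 1, 2, ...
        self.action_probs = action_probs

    def predict(self, observation, deterministic=False):
        actions = list(range(len(self.action_probs)))
        if deterministic:
            # Always pick the most probable action.
            return max(actions, key=lambda a: self.action_probs[a])
        # Otherwise draw an action at random according to the probabilities.
        return random.choices(actions, weights=self.action_probs, k=1)[0]


policy = ToyPolicy([0.5, 0.3, 0.2])
sampled = [policy.predict(None) for _ in range(10)]
greedy = [policy.predict(None, deterministic=True) for _ in range(10)]
print("sampled:", sampled)  # can change from run to run
print("greedy: ", greedy)   # the same action every time
```

The weights never change between calls, yet the sampled actions do, which matches the behaviour described in the question.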

See https://stable-baselines.readthedocs.io/en/master/guide/rl_tips.html for specifics.
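For anyone who wants to check repeatability themselves, a small helper along these lines might be useful. It is only a sketch: `run_episode`, `StubModel`, and `StubEnv` are hypothetical names, the loop assumes the older gym API used in this thread (`reset()` returns the observation, `step()` returns four values), and it assumes the environment itself adds no randomness. The stubs exist only so the snippet runs without gym-mtsim installed.

```python
def run_episode(model, env, deterministic=True):
    """Run one episode; return (total_reward, number_of_steps)."""
    observation = env.reset()
    total_reward, steps = 0.0, 0
    while True:
        action, _states = model.predict(observation, deterministic=deterministic)
        observation, reward, done, info = env.step(action)
        total_reward += float(reward)
        steps += 1
        if done:
            return total_reward, steps


# Hypothetical stand-ins so the sketch runs without gym-mtsim installed.
class StubModel:
    def predict(self, observation, deterministic=False):
        return observation % 2, None  # action depends only on the observation


class StubEnv:
    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        return self.t, float(action), self.t >= 5, {}


first = run_episode(StubModel(), StubEnv())
second = run_episode(StubModel(), StubEnv())
print(first == second)
```

With the real objects, the idea would be to call `run_episode(model, env)` several times after `A2C.load(...)` and compare the returned totals; if they still differ with `deterministic=True`, the remaining randomness is probably coming from the environment rather than the policy.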

AminHP / gym-mtsim
