AminHP / gym-mtsim

A general-purpose, flexible, and easy-to-use simulator alongside an OpenAI Gym trading environment for the MetaTrader 5 trading platform (Approved by OpenAI Gym)
MIT License

[Question] - Why different results each time loaded model is run? #3

Closed snafu4 closed 2 years ago

snafu4 commented 2 years ago

Any ideas why the code below (basically the same as the code in the README.md file) returns different results each time it is run after the model has been trained? The trained model is no longer being updated, and the input (trades) does not change between runs.

```python
model = A2C.load(modelName, env)

observation = env.reset()
while True:
    action, _states = model.predict(observation)
    observation, reward, done, info = env.step(action)

    state = env.render()
    z = calcsomevalue(state)
    if done:
        x = getsomevalue(z)
        y = calsomevalue(z)
        print(<results>)
        break
```

snafu4 commented 2 years ago

Clarification of my question:


```python
import gym
from gym_mtsim import (
    Timeframe, SymbolInfo,
    MtSimulator, OrderType, Order, SymbolNotFound, OrderNotFound,
    MtEnv,
    FOREX_DATA_PATH, STOCKS_DATA_PATH, CRYPTO_DATA_PATH, MIXED_DATA_PATH,
)
from stable_baselines3 import A2C

env = gym.make('forex-hedge-v0')

model = A2C('MultiInputPolicy', env, verbose=0)
model.learn(total_timesteps=1000)

observation = env.reset()
while True:
    action, _states = model.predict(observation)
    observation, reward, done, info = env.step(action)
    if done:
        break

modelName = 'model_test'
model.save(f'{modelName}.zip')

env.render('advanced_figure', time_format="%Y-%m-%d")

###
# running the loop below should not result in different results each time, should it...
# ... but it does!  What am I missing here?
###
model = A2C.load(modelName)

for i in range(5):
    observation = env.reset()
    while True:
        action, _states = model.predict(observation)
        observation, reward, done, info = env.step(action)
        if done:
            break

    env.render('advanced_figure', time_format="%Y-%m-%d")
```

snafu4 commented 2 years ago

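Calling `model.predict(observation, deterministic=True)` appears to fix the problem. By default, `predict` samples an action from the policy's distribution, so the results can differ between runs even though the model and the input are unchanged.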

see https://stable-baselines.readthedocs.io/en/master/guide/rl_tips.html for specifics
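
To illustrate why the flag matters, here is a minimal toy sketch using only the Python standard library. It is not Stable-Baselines code: `ACTION_PROBS`, `predict`, and `rollout` are hypothetical stand-ins for a loaded model with fixed weights, assuming a simple categorical policy. Sampling from the action distribution gives different rollouts, while always taking the most likely action gives identical ones.

```python
import random

# Hypothetical stand-in for a trained policy: fixed action probabilities
# for a single observation. The values never change, like a loaded model.
ACTION_PROBS = [0.5, 0.3, 0.2]


def predict(probs, deterministic=False):
    """Return an action index from a fixed categorical policy (toy model)."""
    if deterministic:
        # Always pick the most likely action: same input, same output.
        return max(range(len(probs)), key=lambda a: probs[a])
    # Sample from the distribution: the output can differ between calls.
    return random.choices(range(len(probs)), weights=probs, k=1)[0]


def rollout(steps, deterministic):
    """Collect the actions chosen over a fixed number of steps."""
    return [predict(ACTION_PROBS, deterministic) for _ in range(steps)]


# Repeat each kind of rollout five times and keep the distinct outcomes.
stochastic_runs = {tuple(rollout(50, deterministic=False)) for _ in range(5)}
deterministic_runs = {tuple(rollout(50, deterministic=True)) for _ in range(5)}

print(len(deterministic_runs))  # 1: every deterministic run is identical
print(len(stochastic_runs))     # more than 1, with overwhelming probability
```

In the loop from the question, the equivalent change is to pass `deterministic=True` to `model.predict` on every step of the evaluation episode.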