AminHP / gym-anytrading

The most simple, flexible, and comprehensive OpenAI Gym trading environment (Approved by OpenAI Gym)
MIT License
2.1k stars 465 forks source link

Confusion over results if run multiple times #44

Closed mkeywood1 closed 3 years ago

mkeywood1 commented 3 years ago

Hi, Sorry if this is the wrong place to ask this but I couldn't find anywhere else. I love the package, excellent work, and I'm sure I am doing something wrong but I wonder if you can explain why when I run the same thing multiple times I get such wildly different results. For example, running this to train / test it 10 times:

import gym
import gym_anytrading
from gym_anytrading.envs import TradingEnv, ForexEnv, Actions, Positions 
from gym_anytrading.datasets import FOREX_EURUSD_1H_ASK, STOCKS_GOOGL

env = gym.make('forex-v0', frame_bound=(50, 100), window_size=10)

for i in range(10):
  observation = env.reset()
  while True:
      action = env.action_space.sample()
      observation, reward, done, info = env.step(action)
      if done:
          print("info:", info)
          break

I get:

info: {'total_reward': -50.99999999999439, 'total_profit': 0.9875980085384239, 'position': 0}
info: {'total_reward': 24.099999999995784, 'total_profit': 0.9886818462999193, 'position': 1}
info: {'total_reward': 24.499999999987313, 'total_profit': 0.9893252791394607, 'position': 0}
info: {'total_reward': 138.10000000000767, 'total_profit': 0.9953009801405461, 'position': 1}
info: {'total_reward': 107.10000000001328, 'total_profit': 0.9926679505350279, 'position': 1}
info: {'total_reward': 127.00000000000375, 'total_profit': 0.996177843192774, 'position': 0}
info: {'total_reward': -144.90000000000117, 'total_profit': 0.9813550423422519, 'position': 1}
info: {'total_reward': -128.90000000000293, 'total_profit': 0.9843355398695747, 'position': 0}
info: {'total_reward': 45.699999999999626, 'total_profit': 0.9912142586709967, 'position': 0}
info: {'total_reward': -39.39999999999389, 'total_profit': 0.9859639867316038, 'position': 1}

Wouldn't I expect them all to have the same position given it's the same data / training, or am I missing something fundamental here?

Thank you

AminHP commented 3 years ago

Hi @mkeywood1, thank you :)

The actions here are selected randomly, so the results will not be the same.

action = env.action_space.sample()

mkeywood1 commented 3 years ago

@AminHP, I can’t believe how silly I was there. How embarrassing. Thank you.