SforAiDl / genrl

A PyTorch reinforcement learning library for generalizable and reproducible algorithm implementations with an aim to improve accessibility in RL
https://genrl.readthedocs.io
MIT License

Off Policy Trainer self._max_episode_len does not update #390

Open bscoventry opened 3 years ago

bscoventry commented 3 years ago

Hello,

I'm running into an interesting issue where self._max_episode_len is always [] in /environments/time_limit.py

System: Windows 10 (most recent update); GPU: GTX1080; CPU: Core i7

Relevant Code:

```python
if __name__ == '__main__':
    """Setup simulation variables"""
    envRender = True
    envLogMode = ['stdout', 'tensorboard', 'csv']
    save_intervals = 100
    envLogDir = "./logs"
    saveLog = "./models"
    maxTimeSteps = 1000
    max_ep_len = 1000

    """Setup simulation: Setup as a TD3 agent"""
    env = VectorEnv("spiker-v0")
    agent = TD3("mlp", env)
    trainer = OffPolicyTrainer(
        agent,
        env,
        max_timesteps=maxTimeSteps,
        render=envRender,
        log_mode=envLogMode,
        log_interval=save_intervals,
        logdir=envLogDir,
        save_model=saveLog,
        max_ep_len=max_ep_len,
    )

    """Run the simulation"""
    trainer.train()
    trainer.evaluate()
```

spiker-v0 is a custom gym environment, but the same error also occurs with more standard environments. My current workaround is to set max_episode_len directly in the initializer, but that is not a satisfying fix.
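For reference, here is a rough sketch of what a more defensive version of that workaround could look like. It is not genrl's actual code: `GuardedTimeLimit` and `DummyEnv` are hypothetical names made up for illustration, and the fallback to `env.spec.max_episode_steps` is an assumption about where a default limit might come from. The idea is just to resolve the limit once at construction and treat `None` as "no limit" instead of comparing against it.

```python
from typing import Optional


class DummyEnv:
    """Stand-in environment for illustration; not part of genrl or gym."""

    def reset(self):
        return 0.0

    def step(self, action):
        # Returns (observation, reward, done, info) and never ends on its own
        return 0.0, 0.0, False, {}


class GuardedTimeLimit:
    """Hypothetical time-limit wrapper that tolerates a missing limit."""

    def __init__(self, env, max_episode_len: Optional[int] = None):
        self.env = env
        if max_episode_len is None:
            # Assumed fallback: use the limit from the env spec, if there is one
            spec = getattr(env, "spec", None)
            max_episode_len = getattr(spec, "max_episode_steps", None)
        self._max_episode_len = max_episode_len
        self._steps_taken = 0

    def reset(self):
        self._steps_taken = 0
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self._steps_taken += 1
        # Only compare when a limit is actually known
        if (
            self._max_episode_len is not None
            and self._steps_taken >= self._max_episode_len
        ):
            done = True
        return obs, reward, done, info


# With a limit of 3, the third step ends the episode
limited = GuardedTimeLimit(DummyEnv(), max_episode_len=3)
dones = [limited.step(0)[2] for _ in range(3)]

# With no limit available, stepping works and the episode is never cut off
unlimited = GuardedTimeLimit(DummyEnv())
unlimited_dones = [unlimited.step(0)[2] for _ in range(5)]
```

This only avoids the crash; it would not explain why the value passed as max_ep_len to the trainer never reaches the wrapper.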

Relevant error:

Traceback (most recent call last):
  File "c:/CodeRepos/gym-spiker/SpikerNet_Main.py", line 33, in <module>
    trainer.train()
  File "C:\Users\bcovent\AppData\Local\Programs\Python\Python36\lib\site-packages\genrl\trainers\offpolicy.py", line 92, in train
    next_state, reward, done, _ = self.env.step(action)
  File "C:\Users\bcovent\AppData\Local\Programs\Python\Python36\lib\site-packages\genrl\environments\vec_env\vector_envs.py", line 159, in step
    obs, reward, done, info = env.step(actions[i])
  File "C:\Users\bcovent\AppData\Local\Programs\Python\Python36\lib\site-packages\genrl\environments\gym_wrapper.py", line 92, in step
    self.state, self.reward, self.done, self.info = self.env.step(action)
  File "C:\Users\bcovent\AppData\Local\Programs\Python\Python36\lib\site-packages\genrl\environments\time_limit.py", line 20, in step
    if self._steps_taken >= self._max_episode_len:
TypeError: '>=' not supported between instances of 'int' and 'NoneType'
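The failing comparison can be reproduced in isolation, without genrl or gym installed. The snippet below is a minimal sketch under that assumption: `TimeLimitSketch` and `DummyEnv` are hypothetical stand-ins that copy only the comparison shown in the last frame of the traceback, not the real wrapper in time_limit.py.

```python
class DummyEnv:
    """Stand-in environment for illustration; not part of genrl or gym."""

    def step(self, action):
        # Returns (observation, reward, done, info)
        return 0.0, 0.0, False, {}


class TimeLimitSketch:
    """Hypothetical wrapper mirroring only the comparison from the traceback."""

    def __init__(self, env, max_episode_len=None):
        self.env = env
        self._max_episode_len = max_episode_len  # stays None if never set
        self._steps_taken = 0

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self._steps_taken += 1
        # Comparing an int against None raises TypeError in Python 3
        if self._steps_taken >= self._max_episode_len:
            done = True
        return obs, reward, done, info


def reproduce():
    """Step once with no limit set and return the error message, if any."""
    wrapped = TimeLimitSketch(DummyEnv())  # no max_episode_len passed
    try:
        wrapped.step(0)
    except TypeError as err:
        return str(err)
    return None


message = reproduce()
print(message)
```

If this matches what happens inside genrl, the error would show up on the very first step whenever the limit is still None at that point, regardless of the environment used.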

No matter how max_ep_len is set in the trainer's initializer, the same error occurs.

Thank you for your time