Closed Nadavborenstein1 closed 4 years ago
Hello,
Please follow the example in the documentation. The Monitor wrapper applies to a gym env, not a VecEnv (you need to wrap each env of the VecEnv with Monitor). You can find a complete example in the rl zoo.
Maybe it could be a good idea to have a VecMonitorWrapper, as is done in the baselines (https://github.com/openai/baselines/blob/master/baselines/common/vec_env/vec_monitor.py). Feel free to submit a PR for that.
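One way to read "wrap each env of the VecEnv" is to build one factory per worker and give each wrapped env its own log path. The sketch below shows only that wiring, using the standard library. `make_env_fns`, `env_ctor`, and `wrapper` are hypothetical names, not part of stable-baselines; in real use `wrapper` would presumably be `Monitor` and the returned list would be passed to `SubprocVecEnv` or `DummyVecEnv`.

```python
import os


def make_env_fns(env_ctor, wrapper, log_dir, n_envs):
    """Return n_envs zero-argument factories, each wrapping its own env.

    env_ctor(rank) builds one environment; wrapper(env, path) is assumed to
    behave like Monitor(env, filename). Both are supplied by the caller.
    """
    def make_one(rank):
        # rank is bound per call to make_one, so each factory keeps its own value.
        def _init():
            path = os.path.join(log_dir, str(rank))  # e.g. logs/0, logs/1, ...
            return wrapper(env_ctor(rank), path)
        return _init

    return [make_one(rank) for rank in range(n_envs)]


if __name__ == "__main__":
    # Stand-ins so the sketch runs without any RL library installed.
    fns = make_env_fns(lambda rank: f"env-{rank}", lambda env, path: (env, path), "logs", 3)
    for fn in fns:
        print(fn())
```

The point of the per-rank path is that no two workers ever write to the same file.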
When following the example by applying the Monitor on the actual env and not the VecEnv it seems that the resulting log file gets corrupted. The callback fails with:
Traceback (most recent call last):
  File "RoomControl_SubProcCallback.py", line 86, in <module>
    model.learn(total_timesteps=total_timesteps, callback=callback)
  File "/big/openai/stable-baselines/stable_baselines/ppo2/ppo2.py", line 400, in learn
    if callback(locals(), globals()) is False:
  File "RoomControl_SubProcCallback.py", line 48, in callback
    x, y = ts2xy(load_results(log_dir), 'timesteps')
  File "/big/openai/stable-baselines/stable_baselines/bench/monitor.py", line 180, in load_results
    data_frame = pandas.read_csv(file_handler, index_col=None)
  File "/big/innovation/venv/lib/python3.7/site-packages/pandas/io/parsers.py", line 685, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/big/innovation/venv/lib/python3.7/site-packages/pandas/io/parsers.py", line 463, in _read
    data = parser.read(nrows)
  File "/big/innovation/venv/lib/python3.7/site-packages/pandas/io/parsers.py", line 1154, in read
    ret = self._engine.read(nrows)
  File "/big/innovation/venv/lib/python3.7/site-packages/pandas/io/parsers.py", line 2059, in read
    data = self._reader.read(nrows)
  File "pandas/_libs/parsers.pyx", line 881, in pandas._libs.parsers.TextReader.read
  File "pandas/_libs/parsers.pyx", line 896, in pandas._libs.parsers.TextReader._read_low_memory
  File "pandas/_libs/parsers.pyx", line 950, in pandas._libs.parsers.TextReader._read_rows
  File "pandas/_libs/parsers.pyx", line 937, in pandas._libs.parsers.TextReader._tokenize_rows
  File "pandas/_libs/parsers.pyx", line 2132, in pandas._libs.parsers.raise_parser_error
pandas.errors.ParserError: Error tokenizing data. C error: Expected 3 fields in line 3, saw 5
The contents of 'monitor.csv' at the moment of failure are:
#{"t_start": 1574695681.5302866, "env_id": "RoomControl-v2"}
11398.178784,7200,121.623053131400.975874,7200,178.948676
What should the format of monitor.csv be?
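Judging from the writer code posted later in this thread, the file is expected to hold a `#`-prefixed JSON line, then a CSV header `r,l,t`, then one row per finished episode. The following is a minimal stdlib sketch of that layout, not the library's own code; `write_monitor` and `read_monitor` are hypothetical helpers, and the two rows in the demo are a guess at what the fused line above was meant to contain.

```python
import csv
import io
import json


def write_monitor(rows, header):
    """Return monitor-style text: a '#'-prefixed JSON line, then CSV with r,l,t."""
    buf = io.StringIO()
    buf.write("#{}\n".format(json.dumps(header)))
    writer = csv.DictWriter(buf, fieldnames=("r", "l", "t"), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def read_monitor(text):
    """Parse text produced by write_monitor into (header dict, list of row dicts)."""
    lines = text.splitlines()
    assert lines[0].startswith("#"), "first line should be the JSON comment"
    header = json.loads(lines[0][1:])
    rows = list(csv.DictReader(lines[1:]))  # values come back as strings
    return header, rows


if __name__ == "__main__":
    # Assumed reconstruction of the two episodes that ended up on one line.
    rows = [
        {"r": 11398.178784, "l": 7200, "t": 121.623053},
        {"r": 131400.975874, "l": 7200, "t": 178.948676},
    ]
    print(write_monitor(rows, {"t_start": 1574695681.5302866, "env_id": "RoomControl-v2"}))
```

Under that reading, the five-field line in the failing file is what two writers produce when they share one file and their rows interleave.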
@jbulow Please follow the rules and, if needed, open an issue with the issue template completely filled in. Note that the error seems to come from your custom env. Please double-check it. As mentioned in the readme, we don't do tech support or consulting, so your issue will be closed if it does not come from stable baselines.
EDIT: it seems you are using one monitor file for multiple envs; this won't work.
I could not find any documentation for Monitor other than the source. As you say, Monitor does not support multiple envs with a common log file, but it does not fail when one tries to do that. I guess opening the log file with O_EXCL might fix the problem. Should I create an issue for this, or does it work as intended? With the current design it's not straightforward how to implement "save-new-best-model" in a vectorized setup.
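For illustration, Python exposes the O_EXCL behaviour through the `"x"` open mode, which refuses to open a file that already exists. The sketch below shows the idea in isolation; `open_exclusive` is a hypothetical helper and this is not how Monitor currently opens its file.

```python
import os
import tempfile


def open_exclusive(path):
    """Open path for writing, failing if it already exists (O_CREAT | O_EXCL)."""
    # Mode "x" is exclusive creation: it raises FileExistsError instead of
    # silently truncating a file that another writer may be using.
    return open(path, "xt")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "monitor.csv")
        with open_exclusive(target) as f:
            f.write("r,l,t\n")
        try:
            open_exclusive(target)
        except FileExistsError:
            print("second writer rejected")
```

One trade-off: a leftover file from an earlier run would be rejected too, so the caller would have to clean up or choose a fresh name.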
Disclaimer: I'm very new to all this and thought that giving feedback was a way to improve stable baselines. In this context I don't have a clear understanding of what is considered "technical support". Point taken regarding opening issues using the template. (GitHub suggested this issue when I was about to open a new one.)
EDIT: after reading the actual source code for load_results I found that the documentation is not correct. load_results loads all files ending in 'monitor.csv' from the given path and not just "Load results from a given file" as the documentation states. Now everything works fine!
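The behaviour described here (collect every file ending in `monitor.csv` and merge them) can be sketched with the standard library as follows. This is an approximation for illustration, not the actual `load_results` implementation; `load_all_monitors` is a hypothetical name, and shifting `t` by `t_start` is an assumption about how rows from different files are made comparable.

```python
import csv
import glob
import json
import os


def load_all_monitors(log_dir):
    """Merge every '*monitor.csv' file in log_dir into one list of episodes."""
    episodes = []
    for path in sorted(glob.glob(os.path.join(log_dir, "*monitor.csv"))):
        with open(path, "rt", newline="") as f:
            first = f.readline()
            header = json.loads(first[1:])  # strip the leading '#'
            for row in csv.DictReader(f):
                episodes.append({
                    "r": float(row["r"]),
                    "l": int(row["l"]),
                    # Shift to wall-clock time so rows from different files line up.
                    "t": float(row["t"]) + header["t_start"],
                })
    episodes.sort(key=lambda e: e["t"])
    return episodes
```

With one file per worker (for example `0.monitor.csv`, `1.monitor.csv`), a single call on the log directory then yields all episodes in time order.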
I also ran into similar problems with monitoring vectorized environments. It was straightforward to tweak VecMonitor from OpenAI Baselines to work with Stable Baselines, as suggested above. This is what I ended up with.
from stable_baselines.common.vec_env import VecEnvWrapper
import numpy as np
import time
from collections import deque
import os.path as osp
import json
import csv


class VecMonitor(VecEnvWrapper):
    EXT = "monitor.csv"

    def __init__(self, venv, filename=None, keep_buf=0, info_keywords=()):
        VecEnvWrapper.__init__(self, venv)
        print('init vecmonitor: ', filename)
        self.eprets = None
        self.eplens = None
        self.epcount = 0
        self.tstart = time.time()
        if filename:
            self.results_writer = ResultsWriter(filename, header={'t_start': self.tstart},
                                                extra_keys=info_keywords)
        else:
            self.results_writer = None
        self.info_keywords = info_keywords
        self.keep_buf = keep_buf
        if self.keep_buf:
            self.epret_buf = deque([], maxlen=keep_buf)
            self.eplen_buf = deque([], maxlen=keep_buf)

    def reset(self):
        obs = self.venv.reset()
        self.eprets = np.zeros(self.num_envs, 'f')
        self.eplens = np.zeros(self.num_envs, 'i')
        return obs

    def step_wait(self):
        obs, rews, dones, infos = self.venv.step_wait()
        self.eprets += rews
        self.eplens += 1
        newinfos = list(infos[:])
        for i in range(len(dones)):
            if dones[i]:
                info = infos[i].copy()
                ret = self.eprets[i]
                eplen = self.eplens[i]
                epinfo = {'r': ret, 'l': eplen, 't': round(time.time() - self.tstart, 6)}
                for k in self.info_keywords:
                    epinfo[k] = info[k]
                info['episode'] = epinfo
                if self.keep_buf:
                    self.epret_buf.append(ret)
                    self.eplen_buf.append(eplen)
                self.epcount += 1
                self.eprets[i] = 0
                self.eplens[i] = 0
                if self.results_writer:
                    self.results_writer.write_row(epinfo)
                newinfos[i] = info
        return obs, rews, dones, newinfos


class ResultsWriter(object):
    def __init__(self, filename, header='', extra_keys=()):
        print('init resultswriter')
        self.extra_keys = extra_keys
        assert filename is not None
        if not filename.endswith(VecMonitor.EXT):
            if osp.isdir(filename):
                filename = osp.join(filename, VecMonitor.EXT)
            else:
                filename = filename  # + "." + VecMonitor.EXT
        self.f = open(filename, "wt")
        if isinstance(header, dict):
            header = '# {} \n'.format(json.dumps(header))
        self.f.write(header)
        self.logger = csv.DictWriter(self.f, fieldnames=('r', 'l', 't') + tuple(extra_keys))
        self.logger.writeheader()
        self.f.flush()

    def write_row(self, epinfo):
        if self.logger:
            self.logger.writerow(epinfo)
            self.f.flush()
@ajtanskanen Will you marry me?
btw, VecMonitor is now included in SB3: https://github.com/DLR-RM/stable-baselines3
Many thanks @ajtanskanen.
So in SB3 simply:
from stable_baselines3.common.vec_env import VecMonitor
env = VecMonitor(env, log_dir)
instead of
from stable_baselines3.common.monitor import Monitor
env = Monitor(env, log_dir)  # won't work with vectorized environments, will throw cryptic errors
sorry for off-topic, googled the error and got here
@araffin do you know which SB3 library version VecMonitor was introduced in? I'm using 2.0.0 but I still get monitoring errors.
I am training an A2C algorithm on a custom environment using multiprocessing and SubprocVecEnv as follows:
env = SubprocVecEnv([lambda: CustomEnv(args, i) for i in range(args.cpus)])
I want to monitor the learning and save model checkpoints using a Monitor and callbacks, however I can't seem to figure out how to combine everything. I've tried doing
env = SubprocVecEnv([lambda: CustomEnv(args, i) for i in range(args.cpus)])
env = Monitor(env, log_dir, allow_early_resets=True)
but I get the following exception:
So what is the correct way of using a monitor in this setting?
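Separately from the monitor question, the list of lambdas in the snippet above has a pitfall worth checking: a lambda inside a comprehension looks up `i` when it is called, not when it is created, so every worker would likely receive the last index. The sketch below demonstrates the effect with plain Python; `broken_factories` and `fixed_factories` are hypothetical names used only for illustration.

```python
def broken_factories(n):
    # Every lambda closes over the same loop variable, so all of them
    # see its final value once the comprehension has finished.
    return [lambda: i for i in range(n)]


def fixed_factories(n):
    # A default argument is evaluated when the lambda is created,
    # which freezes the current value of i for that factory.
    return [lambda i=i: i for i in range(n)]


if __name__ == "__main__":
    print([f() for f in broken_factories(3)])  # [2, 2, 2]
    print([f() for f in fixed_factories(3)])   # [0, 1, 2]
```

Applied to the question, writing `lambda i=i: CustomEnv(args, i)` (or using a small factory function that takes the index) would give each subprocess its own index.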