openai / baselines

OpenAI Baselines: high-quality implementations of reinforcement learning algorithms
MIT License

[Solved] Error using baselines.results_plotter #521

Open damienlancry opened 6 years ago

damienlancry commented 6 years ago

Dear OpenAI and Community,

I am trying to figure out how to use the results_plotter function. I ran a simple test with the Atari environment Pong:

python -m baselines.run --alg=a2c --env=PongNoFrameskip-v4 --num-timesteps=2e7 --save_path=~/models/pong_20M_ppo2

and then I ran

python -m baselines.results_plotter --dirs=/tmp/openai-2018-08-16-18-14-01-752470

And it works perfectly. But then I tried:

python -m baselines.run --alg=ppo2 --env=HalfCheetah-v2 --num_env=8 --save_path=~/models/HC_1M_ppo2

then:

python -m baselines.results_plotter --dirs=/tmp/openai-2018-08-18-17-32-10-748103

And I get the following error:

Traceback (most recent call last):
  File "/usr/lib/python3.5/runpy.py", line 184, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.5/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/homes/drl17/Documents/Project/baselines/results_plotter.py", line 87, in <module>
    main()
  File "/homes/drl17/Documents/Project/baselines/results_plotter.py", line 83, in main
    plot_results(args.dirs, args.num_timesteps, args.xaxis, args.task_name)
  File "/homes/drl17/Documents/Project/baselines/results_plotter.py", line 61, in plot_results
    ts = load_results(dir)
  File "/homes/drl17/Documents/Project/baselines/bench/monitor.py", line 119, in load_results
    df = pandas.read_csv(fh, index_col=None)
  File "/homes/drl17/Documents/Project/Torcs_Project/env3/lib/python3.5/site-packages/pandas/io/parsers.py", line 678, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/homes/drl17/Documents/Project/Torcs_Project/env3/lib/python3.5/site-packages/pandas/io/parsers.py", line 446, in _read
    data = parser.read(nrows)
  File "/homes/drl17/Documents/Project/Torcs_Project/env3/lib/python3.5/site-packages/pandas/io/parsers.py", line 1036, in read
    ret = self._engine.read(nrows)
  File "/homes/drl17/Documents/Project/Torcs_Project/env3/lib/python3.5/site-packages/pandas/io/parsers.py", line 1848, in read
    data = self._reader.read(nrows)
  File "pandas/_libs/parsers.pyx", line 876, in pandas._libs.parsers.TextReader.read
  File "pandas/_libs/parsers.pyx", line 891, in pandas._libs.parsers.TextReader._read_low_memory
  File "pandas/_libs/parsers.pyx", line 945, in pandas._libs.parsers.TextReader._read_rows
  File "pandas/_libs/parsers.pyx", line 932, in pandas._libs.parsers.TextReader._tokenize_rows
  File "pandas/_libs/parsers.pyx", line 2112, in pandas._libs.parsers.raise_parser_error
pandas.errors.ParserError: Error tokenizing data. C error: Expected 3 fields in line 7, saw 5

At first I thought it was because I used the argument num_env=8, but then I realized that Atari uses num_env = multiprocessing.cpu_count() by default. So, any idea where this is coming from?

Cheers!

EDIT: it works perfectly on MuJoCo with num_env = 1 (on the Reacher environment). I really think the parallel monitored environments are calling the csv.DictWriter().writerow() function at the same time, which is corrupting the monitor.csv file. But I can't figure out why it does this on MuJoCo environments and not on Atari environments. In any case, the writerow call should be protected by a mutex or semaphore (I'm not a multiprocessing expert, so I don't know if that is the right terminology). I'm going to try to do something about it.

damienlancry commented 6 years ago

I am really struggling with how to use these functions, but they would be really useful... Does anybody know how to use them properly?

erincmer commented 6 years ago

This happens to me when I log results into the same folder. Just check whether there are multiple .monitor files in the same folder.

damienlancry commented 6 years ago

Hi, thanks for your answer. Unfortunately, I don't think my problem is related to that, because I kept the default log directories, i.e. /tmp/openai-<date+hour>, so there is always only one log result per directory. :(

damienlancry commented 6 years ago

It could be because we need at least a hundred episodes per agent, and the training only lasts 1M timesteps across all agents, if my understanding is correct.

erincmer commented 6 years ago

The 100-episode window can be changed in results_plotter; you can even set it to 1. HalfCheetah is a long-episode environment, so try Hopper instead. If there are too few episodes you get a "negative dimensions are not allowed" error, but your error comes from the .monitor files: pandas cannot parse them. You may be using the same monitor file more than once (if all 8 envs write to the same .monitor.csv), so the writes collide. Check the number of fields per row in the .monitor.csv file: it should be 3, but one of your rows has 5, hence the parse failure.

damienlancry commented 6 years ago

Yes, it is the monitor.csv that is very messy. I think it doesn't play well with multiprocessing on MuJoCo environments, which is a pity. But maybe it's just me not doing it right.

damienlancry commented 6 years ago

Just found out why this happens on MuJoCo envs (at least Reacher and HalfCheetah) and not on Atari envs (at least Pong and Breakout).

This is because in MuJoCo envs there is a termination condition based on elapsed episode time. As a consequence, at least at the beginning of training, every parallel env hits its first done = True at the same time, and they all try to write to 0.monitor.csv at once. This results in something like this (opened with vim):

#{"env_id": "Reacher-v2", "t_start": 1534953242.656185}
r,l,t^M
-90.540712,50,1.837869^M
-103.335179,50,1.937895^M
-86.699783,50,2.05136^M
-112.203586,50,2.159039^M
-112.168917,50,2.255508^M
-128.83833,50,2.354578^M
-108.774769,50,2.457605^M
-1-109.082413,50,2.686389-119.676026,50,2.654104^M
-111.755901,50,2.750567---85.499553,50,2.849706^M--93.675179,50,2.955392^M
-104.237215,50,3.063637^M

On the contrary, Atari envs have more variance in episode length, so done = True is rarely reached at the same timestep by two different envs. That results in something like this (again opened with vim):

#{"env_id": "HalfCheetah-v2", "t_start": 1534439643.5216653}
r,l,t^M
-255.917874,1000,9.828193^M
-240.477985,1000,17.473527^M
-602.461787,1000,25.421956^M
-412.191858,1000,33.092345^M
-449.831362,1000,40.931932^M
-357.177889,1000,48.556288^M
-210.854723,1000,56.315493^M
-347.997157,1000,64.064439^M
749.733397,1000,72.024559^M
-327.658974,1000,79.714595^M
-270.476361,1000,87.379286^M
-406.112755,1000,95.23366^M
-444.303552,1000,102.948899^M
-479.953277,1000,110.714009^M
-79.515011,1000,118.378877^M
-417.356146,1000,126.207028^M
-390.424013,1000,133.969844^M

I still do not know how to fix this but am working on it, if somebody has any suggestion, please let me know :)

EDIT: just realized the last output is from the HalfCheetah run that did not use --num_env=8 (the default is 1 on MuJoCo envs). I could not find my previous Atari logs, so I reran one just now, and interestingly there are actually 8 monitor.csv files (due to MPI.COMM_WORLD.Get_rank()). So now I have to figure out why there is only one csv on MuJoCo envs (there should be 8 of them when using MPI.COMM_WORLD.Get_rank()).

GbengaOdesanmi commented 5 years ago

Please, I am having a similar problem. Where do I get the monitor.csv file?

python -m baselines.results_plotter --dirs=/tmp/openai-2018-11-30-16-53-40-674939

Traceback (most recent call last):
  File "/home/gbenga/Downloads/abiona1008/envs/tensorflow/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/gbenga/Downloads/abiona1008/envs/tensorflow/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/gbenga/baselines/baselines/results_plotter.py", line 95, in <module>
    main()
  File "/home/gbenga/baselines/baselines/results_plotter.py", line 91, in main
    plot_results(args.dirs, args.num_timesteps, args.xaxis, args.yaxis, args.task_name)
  File "/home/gbenga/baselines/baselines/results_plotter.py", line 68, in plot_results
    ts = load_results(dir)
  File "/home/gbenga/baselines/baselines/bench/monitor.py", line 134, in load_results
    raise LoadMonitorResultsError("no monitor files of the form %s found in %s" % (Monitor.EXT, dir))
baselines.bench.monitor.LoadMonitorResultsError: no monitor files of the form monitor.csv found in /tmp/openai-2018-11-30-16-53-40-674939