IBM / rl-testbed-for-energyplus

Reinforcement Learning Testbed for Power Consumption Optimization using EnergyPlus
MIT License

Learn() 0 vs 2 positional arguments #36

Closed Ryan-Johnson-1315 closed 2 years ago

Ryan-Johnson-1315 commented 4 years ago

This is the output

2019-10-25 10:49:20.032095: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-10-25 10:49:20.056104: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2194795000 Hz
2019-10-25 10:49:20.056700: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x3fc7190 executing computations on platform Host. Devices:
2019-10-25 10:49:20.056745: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): ,
train: init logger with dir=/home/user/eplog/openai-2019-10-25-10-49-20-057253
Logging to /home/user/eplog/openai-2019-10-25-10-49-20-057253
Monitor: filename=/home/user/eplog/openai-2019-10-25-10-49-20-057253
Traceback (most recent call last):
  File "trpo_mpi/run_energyplus.py", line 58, in <module>
    main()
  File "trpo_mpi/run_energyplus.py", line 55, in main
    train(args.env, num_timesteps=args.num_timesteps, seed=args.seed)
  File "trpo_mpi/run_energyplus.py", line 50, in train
    gamma=0.99, lam=0.98, vf_iters=5, vf_stepsize=1e-3)
TypeError: learn() takes 0 positional arguments but 2 positional arguments (and 8 keyword-only arguments) were given

I found the other issue that suggested checking out a different commit of baselines to solve the problem, but I get exactly the same output.

bonecountysheriff commented 4 years ago

Looks like the project was archived and is not being maintained anymore. With updates to the openai/baselines module, the definitions of a few functions have changed quite a bit. As the error message shows, learn() no longer accepts positional arguments, so the quickest fix is to edit the trpo_mpi.learn() call in baselines_energyplus/trpo_mpi/run_energyplus.py and turn its first two positional arguments into keyword arguments:

  1. Change env to env=env
  2. Replace policy_fn with network='mlp'
  3. Change max_timesteps to total_timesteps

I tested this with baselines v0.1.6, the latest at the moment.
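The three changes above can be illustrated with a minimal, self-contained sketch. It does not import baselines: the `learn` function below is a hypothetical stand-in that only mimics a keyword-only signature, and `env`, `policy_fn`, and `num_timesteps` are placeholders. Only the four hyperparameters visible in the traceback are passed; the other keyword arguments of the real call are omitted.

```python
# Hypothetical stand-in for a keyword-only learn(); NOT the real baselines function.
# The bare "*" forces every parameter to be passed by keyword.
def learn(*, network, env, total_timesteps, gamma=0.99, lam=1.0,
          vf_iters=3, vf_stepsize=3e-4, **network_kwargs):
    """Return the received arguments so the call can be inspected."""
    return {
        "network": network,
        "env": env,
        "total_timesteps": total_timesteps,
        "gamma": gamma,
        "lam": lam,
        "vf_iters": vf_iters,
        "vf_stepsize": vf_stepsize,
    }


env = object()        # placeholder for the EnergyPlus gym environment
policy_fn = object()  # placeholder for the old policy factory
num_timesteps = 1000  # placeholder value

# Old-style call: env and policy_fn are positional, so Python raises TypeError.
try:
    learn(env, policy_fn, max_timesteps=num_timesteps,
          gamma=0.99, lam=0.98, vf_iters=5, vf_stepsize=1e-3)
    old_call_error = None
except TypeError as exc:
    old_call_error = str(exc)

# New-style call: env=env, network='mlp' instead of policy_fn,
# and total_timesteps instead of max_timesteps.
result = learn(env=env, network="mlp", total_timesteps=num_timesteps,
               gamma=0.99, lam=0.98, vf_iters=5, vf_stepsize=1e-3)

print("old call failed with:", old_call_error)
print("new call received network =", result["network"])
```

In run_energyplus.py the same pattern would apply to the existing trpo_mpi.learn(...) call, leaving the remaining keyword arguments unchanged.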

antoine-galataud commented 2 years ago

Closing this issue since baselines was upgraded to v0.1.6.