hi there, I was trying to use your gym and stable_baselines3 for the RWA problem with the following code:
# Environment arguments for the simulation
env_args = dict(topology=topology, seed=10, allow_rejection=True, load=load,
                mean_service_holding_time=25, episode_length=episode_length,
                num_spectrum_resources=64)
env = gym.make('RWA-v0', **env_args)

# here go the arguments of the policy network to be used
policy_args = dict(net_arch=5*[128])  # we use the elu activation function
agent = PPO(MlpPolicy, env, verbose=0, tensorboard_log="./tb/PPO-RWA-v0/",
            policy_kwargs=policy_args, gamma=.95, learning_rate=10e-6)
a = agent.learn(total_timesteps=10_000, callback=callback)
An error was encountered as follows:

Traceback (most recent call last):
  File "d:/optical-rl-gym-main/examples/stable_baselines3/SimpleRWA.py", line 136, in
    a = agent.learn(total_timesteps=10_000, callback=callback)
  File "C:\Users\ \AppData\Local\Programs\Python\Python37\lib\site-packages\stable_baselines3\ppo\ppo.py", line 326, in learn
    reset_num_timesteps=reset_num_timesteps,
  File "C:\Users\ \AppData\Local\Programs\Python\Python37\lib\site-packages\stable_baselines3\common\on_policy_algorithm.py", line 255, in learn
    progress_bar,
  File "C:\Users\ \AppData\Local\Programs\Python\Python37\lib\site-packages\stable_baselines3\common\base_class.py", line 489, in _setup_learn
    self._last_obs = self.env.reset()  # pytype: disable=annotation-type-mismatch
  File "C:\Users\ \AppData\Local\Programs\Python\Python37\lib\site-packages\stable_baselines3\common\vec_env\dummy_vec_env.py", line 64, in reset
    self._save_obs(env_idx, obs)
  File "C:\Users\ \AppData\Local\Programs\Python\Python37\lib\site-packages\stable_baselines3\common\vec_env\dummy_vec_env.py", line 96, in _save_obs
    self.buf_obs[key][env_idx] = obs[key]
KeyError: 'current_service'
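For context, `DummyVecEnv._save_obs` copies each key declared in the Dict observation space out of the dict the env returned, so any key present in the space but absent from the `reset()` observation raises exactly this KeyError. A minimal sketch of that copy loop (simplified, with hypothetical shapes):

```python
import numpy as np

# Simplified version of the copy loop in DummyVecEnv._save_obs: the buffer
# keys come from the declared observation space, while obs comes from reset().
space_keys = ["current_service"]                      # keys in observation_space
buf_obs = {k: np.zeros((1, 4)) for k in space_keys}   # one row per sub-env

obs = {"service": np.arange(4)}                       # key the env actually returned

try:
    for key in space_keys:
        buf_obs[key][0] = obs[key]                    # obs lacks "current_service"
except KeyError as err:
    print("KeyError:", err)                           # same failure mode as above
```

So whichever side is renamed, the fix has to make the space's keys and the returned dict's keys agree.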
If I modify the observation() in reset(), changing the key 'service' to 'current_service', it then reports:

Traceback (most recent call last):
  File "d:/optical-rl-gym-main/examples/stable_baselines3/SimpleRWA.py", line 136, in
    a = agent.learn(total_timesteps=10_000, callback=callback)
  File "C:\Users\ \AppData\Local\Programs\Python\Python37\lib\site-packages\stable_baselines3\ppo\ppo.py", line 326, in learn
    reset_num_timesteps=reset_num_timesteps,
  File "C:\Users\ \AppData\Local\Programs\Python\Python37\lib\site-packages\stable_baselines3\common\on_policy_algorithm.py", line 255, in learn
    progress_bar,
  File "C:\Users\ \AppData\Local\Programs\Python\Python37\lib\site-packages\stable_baselines3\common\base_class.py", line 489, in _setup_learn
    self._last_obs = self.env.reset()  # pytype: disable=annotation-type-mismatch
  File "C:\Users\ \AppData\Local\Programs\Python\Python37\lib\site-packages\stable_baselines3\common\vec_env\dummy_vec_env.py", line 64, in reset
    self._save_obs(env_idx, obs)
  File "C:\Users\ \AppData\Local\Programs\Python\Python37\lib\site-packages\stable_baselines3\common\vec_env\dummy_vec_env.py", line 96, in _save_obs
    self.buf_obs[key][env_idx] = obs[key]
TypeError: int() argument must be a string, a bytes-like object or a number, not 'Service'
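This second error suggests that, once the key names agree, the value stored under that key is still the raw `Service` object, which NumPy cannot write into the numeric observation buffer. One possible workaround is to have observation() return a numeric encoding of the current service instead of the object itself. A sketch under that assumption (the `Service` class and its `source_id`/`destination_id` attributes are hypothetical stand-ins; adapt them to the actual fields):

```python
import numpy as np

class Service:                                   # stand-in for the env's Service
    def __init__(self, src, dst):
        self.source_id, self.destination_id = src, dst

def encode_service(svc, num_nodes):
    """One-hot encode source and destination so the obs is purely numeric."""
    vec = np.zeros(2 * num_nodes, dtype=np.float32)
    vec[svc.source_id] = 1.0                     # first half: source node
    vec[num_nodes + svc.destination_id] = 1.0    # second half: destination node
    return vec

obs = encode_service(Service(1, 3), num_nodes=5)
print(obs.dtype, obs.sum())                      # float32 2.0
```

The declared Box sub-space for this key would then have to match the vector's shape and dtype.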
Actually, if you test the env with

from stable_baselines3.common.env_checker import check_env
check_env(env)

it will show you the same error. Please help me with this, thanks.