Closed vwxyzjn closed 2 years ago
Hi @vwxyzjn,
Thanks for making this request. This is actually a feature we are in the early stages of putting together right now. We agree that it would be very useful for users like yourself, and we hope to have more to share in the coming months.
+1 for the request.
Another benefit of using gym.make("") would be video replay.
Without that, is there any way to do video replay at the moment? @awjuliani
Hi @vwxyzjn and @yijiezh
Our next release of ML-Agents (coming this week) will actually include an environment registry. You can read about it on the documentation for our master branch: https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Unity-Environment-Registry.md.
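For readers unfamiliar with the registry idea: the pattern is simply a name-to-factory mapping, so environments can be constructed by string ID without the user knowing where the binary lives. Below is a toy pure-Python sketch of that pattern only; every name in it is a hypothetical illustration, not the actual `mlagents_envs` API (see the linked doc for the real interface):

```python
# Toy sketch of an environment-registry pattern: map string IDs to
# factory callables so callers can construct environments by name.
# All names here are hypothetical illustrations, not the mlagents_envs API.

class FakeEnv:
    """Stand-in for a real environment object."""
    def __init__(self, name):
        self.name = name

class RegistryEntry:
    """One registry slot: knows how to build its environment on demand."""
    def __init__(self, name, factory):
        self.name = name
        self._factory = factory

    def make(self, **kwargs):
        # A real registry could first download a prebuilt binary here.
        return self._factory(**kwargs)

registry = {
    "GridWorld": RegistryEntry("GridWorld", lambda: FakeEnv("GridWorld")),
    "3DBall": RegistryEntry("3DBall", lambda: FakeEnv("3DBall")),
}

env = registry["GridWorld"].make()
print(env.name)  # GridWorld
```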
Thanks @awjuliani ! This looks wonderful.
Will this support the gym Monitor interface for video replay?
@yijiezh btw the Monitor issue is also mentioned in #3954 :)
Thanks @vwxyzjn
I am pretty new to Unity, so correct me if I am wrong. I think the gym Monitor can only be applied when the env was created by `gym.make`. It doesn't apply to envs created by `UnityToGymWrapper`, does it?
As long as the environment implements `env.render(mode='rgb_array')`, Monitor can be applied.
How can I know whether the env implements that rendering mode, given that the env file is a Unity binary?
```python
print(env.render(mode='rgb_array'))
```
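Generalizing the one-liner above: you can probe for `rgb_array` support by calling `render` and checking whether it returns a frame. A minimal sketch, where `DummyEnv` is a made-up stand-in rather than a Unity environment:

```python
# Probe whether an env supports rgb_array rendering by calling render()
# and checking that it returns a frame rather than None or raising.
# DummyEnv is a made-up stand-in for illustration, not a Unity binary.

class DummyEnv:
    def render(self, mode="human"):
        if mode == "rgb_array":
            # A real env would return an HxWx3 array; nested lists
            # keep this sketch dependency-free.
            return [[[0, 0, 0]]]
        return None

def supports_rgb_array(env):
    try:
        frame = env.render(mode="rgb_array")
    except Exception:
        return False
    return frame is not None

print(supports_rgb_array(DummyEnv()))  # True
```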
This would also make the code simpler: fewer special cases like `if env_name in unity_env_names: unity_env = UnityEnvironment(PATH_LOOKUP[env_name])`.
Issue logged as MLA-1943
I insist on pushing the gym wrapper to be fully compatible with gym so we can all use the algorithms that are already implemented outside. It takes too much overhead adapting every single one of them TO the UnityEnvironment API.
Glad to see a lot of feedback in this thread. I'd add a couple more things.
If the gym API is too slow, one thing to consider is using the vectorized environment API instead. This is the approach taken by procgen, gym-microrts, and others.
So using the gym API with SB3, it looks like the following:

```python
import gym
from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import VecMonitor, VecVideoRecorder, DummyVecEnv

env = DummyVecEnv([lambda: gym.make("procgen:procgen-starpilot-v0")])
# Record the video starting at the first step
env = VecVideoRecorder(env, 'logs/videos/',
    record_video_trigger=lambda x: x == 0, video_length=100)
# Wrap with a VecMonitor to collect stats and avoid errors
env = VecMonitor(env=env)
model = PPO("MultiInputPolicy", env, verbose=1)
model.learn(10000)
```
Whereas using the vectorized environment API, it looks like this:

```diff
-import gym
+from procgen import ProcgenEnv
 from stable_baselines3 import PPO
 from stable_baselines3.common.vec_env import VecMonitor, VecVideoRecorder, DummyVecEnv

+# ProcgenEnv is already vectorized
-env = DummyVecEnv([lambda: gym.make("procgen:procgen-starpilot-v0")])
+env = ProcgenEnv(num_envs=2, env_name='starpilot')
 # Record the video starting at the first step
 env = VecVideoRecorder(env, 'logs/videos/',
     record_video_trigger=lambda x: x == 0, video_length=100)
 # Wrap with a VecMonitor to collect stats and avoid errors
 env = VecMonitor(env=env)
 model = PPO("MultiInputPolicy", env, verbose=1)
 model.learn(10000)
```
When setting `num_envs=1`, this vectorized environment would also work with DQN from SB3.
So a hypothetical `gym_unity` vectorized API could look like this:

```python
import gym_unity
from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import VecMonitor, VecVideoRecorder

# The proposed gym_unity VecEnv would already be vectorized
env = gym_unity.VecEnv(num_envs=2, env_name='GridWorldPixels')
# Record the video starting at the first step
env = VecVideoRecorder(env, 'logs/videos/',
    record_video_trigger=lambda x: x == 0, video_length=100)
# Wrap with a VecMonitor to collect stats and avoid errors
env = VecMonitor(env=env)
model = PPO("MultiInputPolicy", env, verbose=1)
model.learn(10000)
```
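To make the gym-API vs vectorized-API contrast concrete: a vectorized env steps N sub-environments per call and returns batched observations, rewards, and done flags. A minimal pure-Python sketch of the idea (hypothetical, not the procgen or stable-baselines3 implementation):

```python
# Minimal sketch of a vectorized environment: one step() call advances
# N sub-environments and returns batched results. Hypothetical code,
# not the procgen or stable-baselines3 implementation.

class CounterEnv:
    """Toy sub-environment whose observation is a step counter."""
    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        obs, reward, done = self.t, 1.0, self.t >= 3
        if done:
            obs = self.reset()  # vec envs conventionally auto-reset
        return obs, reward, done

class VecEnv:
    def __init__(self, num_envs):
        self.envs = [CounterEnv() for _ in range(num_envs)]

    def reset(self):
        return [e.reset() for e in self.envs]

    def step(self, actions):
        results = [e.step(a) for e, a in zip(self.envs, actions)]
        obs, rewards, dones = map(list, zip(*results))
        return obs, rewards, dones

venv = VecEnv(num_envs=2)
print(venv.reset())          # [0, 0]
obs, rewards, dones = venv.step([0, 0])
print(obs, rewards, dones)   # [1, 1] [1.0, 1.0] [False, False]
```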
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
**Is your feature request related to a problem? Please describe.**
The current `gym_unity` package is not a typical gym extension. It requires users to download Unity, obtain a license, compile binaries, and then use the `UnityToGymWrapper` to convert a `UnityEnvironment` into a gym environment. The whole procedure is fairly burdensome for researchers who just want to run some experiments instead of creating game environments.

**Describe the solution you'd like**
For a standard gym extension, see https://github.com/maximecb/gym-minigrid. The users should only need to do the following and run the following file. This is a generally accepted API and there are many benefits to it.

**Implementation detail**
To achieve this simplicity and ease of use, `gym_unity` should handle the download of the binaries itself. A very crude way of doing so is demonstrated above: host the binaries at a URL, and when the user calls `gym.make("GridWorld-v0")` for the first time, the `gym_unity` package automatically downloads the pre-compiled binaries. Such a procedure can further be fully automated with CI/CD. As an example, `gym-microrts` always builds the binaries after a commit is made (http://microrts.s3-website-us-east-1.amazonaws.com/microrts/artifacts/).

I see great potential in `gym_unity` as a convenient replacement for many commonly used games in gym, such as CartPole-v0 and standard Mujoco tasks. It would be fantastic if this feature request gets fulfilled. Thanks.
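The download-on-first-use step described above can be sketched as a small cache-then-fetch helper. Everything here (function names, the example URL, the cache layout) is a hypothetical illustration of the idea, not the actual gym_unity implementation:

```python
# Sketch of download-on-first-use for pre-compiled env binaries:
# if the binary is already cached locally, skip the network entirely.
# All names and the URL below are hypothetical illustrations.
import os
import tempfile
import urllib.request

def cached_binary_path(env_id, cache_dir):
    """Where the binary for env_id would live in the local cache."""
    return os.path.join(cache_dir, env_id, "env.bin")

def ensure_binary(env_id, url, cache_dir):
    """Return a local binary path, downloading it only on first use."""
    path = cached_binary_path(env_id, cache_dir)
    if os.path.exists(path):
        return path  # cache hit: no download needed
    os.makedirs(os.path.dirname(path), exist_ok=True)
    urllib.request.urlretrieve(url, path)  # cache miss: fetch once
    return path

# Demonstrate the cache-hit path offline by pre-seeding the cache.
cache = tempfile.mkdtemp()
seeded = cached_binary_path("GridWorld-v0", cache)
os.makedirs(os.path.dirname(seeded))
with open(seeded, "wb") as f:
    f.write(b"fake binary")

path = ensure_binary("GridWorld-v0", "https://example.com/GridWorld.bin", cache)
print(path == seeded)  # True: cache hit, nothing downloaded
```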