Open sengengie opened 5 years ago
From what I've seen, it should be possible to add pretty much any environment with a discrete action space. Adding a custom environment is actually quite simple: you can reuse the gym preprocessing at https://github.com/google/dopamine/blob/76cdae1f858233a8501e2b61095cde54c6f8a214/dopamine/discrete_domains/gym_lib.py#L306
This lets you create any environment you want, as long as it exposes the functions and variables the preprocessing expects. How you handle the observations and turn them into a set of Q-values for the actions is also up to you: you need to write a function that returns a network and pass it to your Rainbow agent when you create it.
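As a concrete illustration of "the functions and variables it needs", here is a minimal sketch of a gym-style environment with a discrete action space. This is not from the Dopamine source; `LanguageEnv`, `NUM_LANGUAGES`, and the dummy sentence data are hypothetical placeholders.

```python
# Minimal sketch of a gym-style environment interface:
# reset() -> observation, step(action) -> (observation, reward, done, info).
import random

NUM_LANGUAGES = 3  # hypothetical: e.g. English, French, German
SENTENCES = [("hello world", 0), ("bonjour le monde", 1), ("hallo welt", 2)]

class LanguageEnv:
    """Toy language-detection environment with a discrete action space."""

    def __init__(self, max_steps=10):
        self.max_steps = max_steps
        self._steps = 0
        self._current = None

    def reset(self):
        self._steps = 0
        self._current = random.choice(SENTENCES)
        return self._current[0]  # the sentence the agent must classify

    def step(self, action):
        # Reward 1 for choosing the correct language, 0 otherwise.
        reward = 1.0 if action == self._current[1] else 0.0
        self._steps += 1
        done = self._steps >= self.max_steps
        self._current = random.choice(SENTENCES)
        return self._current[0], reward, done, {}
```

A real environment for Dopamine would additionally define `action_space` and `observation_space`, which the gym preprocessing wrapper reads.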
Hello, I'm working on an AI project. I need to create an environment with gym and train it in Dopamine with its agents (Rainbow, DQN, ...). I have already created this environment but can't launch it with Dopamine:
My environment is a text classifier. The agent reads a sentence and then chooses one language from a list, so it needs to detect the language of the current sentence. At every step it reads another sentence, gives an answer, and so on. I give the agent a string value with no fixed length.
I looked at c51_cartpole.gin and gym_lib.py to try to understand how to add my environment to the Dopamine project.
Language is the name of my model.
My additions to `gym_lib.py`:

```python
CARTPOLE_MIN_VALS = np.array([-2.4, -5., -math.pi/12., -math.pi*2.])
CARTPOLE_MAX_VALS = np.array([2.4, 5., math.pi/12., math.pi*2.])
LANGUAGE_MIN_VALS = ""
LANGUAGE_MAX_VALS = ""
ACROBOT_MIN_VALS = np.array([-1., -1., -1., -1., -5., -5.])
ACROBOT_MAX_VALS = np.array([1., 1., 1., 1., 5., 5.])
gin.constant('gym_lib.CARTPOLE_OBSERVATION_SHAPE', (4, 1))
gin.constant('gym_lib.CARTPOLE_OBSERVATION_DTYPE', tf.float32)
gin.constant('gym_lib.CARTPOLE_STACK_SIZE', 1)
gin.constant('gym_lib.ACROBOT_OBSERVATION_SHAPE', (6, 1))
gin.constant('gym_lib.ACROBOT_OBSERVATION_DTYPE', tf.float32)
gin.constant('gym_lib.ACROBOT_STACK_SIZE', 1)
gin.constant('gym_lib.LANGUAGE_OBSERVATION_SHAPE', (1, 1))
gin.constant('gym_lib.LANGUAGE_OBSERVATION_DTYPE', tf.string)
gin.constant('gym_lib.LANGUAGE_STACK_SIZE', 51)
```
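Note that Dopamine's replay buffer and networks expect fixed-shape numeric observations, so a `tf.string` observation dtype is unlikely to work out of the box. One workaround (an assumption on my part, not something Dopamine provides) is to encode each sentence into a fixed-length integer vector and use that as the observation; `MAX_LEN` and the byte-level encoding below are hypothetical choices.

```python
# Hypothetical helper: encode a variable-length sentence into a fixed-size
# numeric observation that fits a fixed OBSERVATION_SHAPE such as (MAX_LEN, 1).
import numpy as np

MAX_LEN = 64  # assumed maximum sentence length; longer input is truncated

def encode_sentence(sentence):
    """Return a (MAX_LEN, 1) uint8 array of UTF-8 byte values, zero-padded."""
    data = sentence.encode("utf-8")[:MAX_LEN]
    obs = np.zeros((MAX_LEN, 1), dtype=np.uint8)
    obs[:len(data), 0] = list(data)
    return obs
```

With an encoding like this, the gin constants could become `LANGUAGE_OBSERVATION_SHAPE = (64, 1)` and `LANGUAGE_OBSERVATION_DTYPE = tf.uint8` instead of `tf.string`.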
I tried to follow the existing organisation, but I think I have a problem with the input data.
My `c51_language.gin`:

```
# Hyperparameters for a simple C51-style Language agent. The hyperparameters
# chosen achieve reasonable performance.
import dopamine.agents.dqn.dqn_agent
import dopamine.agents.rainbow.rainbow_agent
import dopamine.discrete_domains.gym_lib
import dopamine.discrete_domains.run_experiment
import dopamine.replay_memory.prioritized_replay_buffer
import gin.tf.external_configurables

RainbowAgent.observation_shape = %gym_lib.LANGUAGE_OBSERVATION_SHAPE
RainbowAgent.observation_dtype = %gym_lib.LANGUAGE_OBSERVATION_DTYPE
RainbowAgent.stack_size = %gym_lib.LANGUAGE_STACK_SIZE
RainbowAgent.network = @gym_lib.language_rainbow_network
RainbowAgent.num_atoms = 51
RainbowAgent.vmax = 10.
RainbowAgent.gamma = 0.99
RainbowAgent.update_horizon = 1
RainbowAgent.min_replay_history = 500
RainbowAgent.update_period = 4
RainbowAgent.target_update_period = 100
RainbowAgent.epsilon_fn = @dqn_agent.identity_epsilon
RainbowAgent.replay_scheme = 'uniform'
RainbowAgent.tf_device = '/gpu:0'  # use '/cpu:*' for non-GPU version
RainbowAgent.optimizer = @tf.train.AdamOptimizer()

tf.train.AdamOptimizer.learning_rate = 0.001
tf.train.AdamOptimizer.epsilon = 0.0003125

create_gym_environment.environment_name = 'language'
create_gym_environment.version = 'v0'
create_agent.agent_name = 'rainbow'
Runner.create_environment_fn = @gym_lib.create_gym_environment
Runner.num_iterations = 500
Runner.training_steps = 1000
Runner.evaluation_steps = 1000
Runner.max_steps_per_episode = 200  # Default max episode length.

WrappedPrioritizedReplayBuffer.replay_capacity = 50000
WrappedPrioritizedReplayBuffer.batch_size = 128
```
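One more thing to check: `gym_lib.create_gym_environment` builds the id `'{environment_name}-{version}'` and passes it to `gym.make`, so the custom environment must also be registered with gym under that id before the runner starts. A sketch of that registration, where the `entry_point` module path and class name are assumptions:

```python
# Register the custom environment with gym so gym.make('language-v0') works.
# The entry_point path 'my_package.language_env:LanguageEnv' is hypothetical.
from gym.envs.registration import register

register(
    id='language-v0',
    entry_point='my_package.language_env:LanguageEnv',
    max_episode_steps=200,  # matches Runner.max_steps_per_episode above
)
```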
Is it possible to add a gym environment with string observations to Dopamine?