gsurma / atari

AI research environment for the Atari 2600 games 🤖.
https://gsurma.github.io

Exponential increase in loss #7

Open hariharan-jayakumar opened 4 years ago

hariharan-jayakumar commented 4 years ago

Hi @gsurma,

Thank you for the wonderful code and the Medium article. I tried implementing your code, but found that the loss in my model blows up (increases exponentially) after some time.

These are the hyper-parameters I used:

```python
# initialize environment
env = MainGymWrapper.wrap(gym.make('SpaceInvaders-v0'))
env = gym.make('SpaceInvaders-v0')

# define hyperparameters
total_step_limit = 5000000
wandb.config.episodes = 1000
GAMMA = 0.99
MEMORY_SIZE = 350000
BATCH_SIZE = 32
TRAINING_FREQUENCY = 4
TARGET_NETWORK_UPDATE_FREQUENCY = 40000
MODEL_PERSISTENCE_UPDATE_FREQUENCY = 10000
REPLAY_START_SIZE = 50000
action_size = env.action_space.n
EXPLORATION_MAX = 1.0
EXPLORATION_MIN = 0.1
EXPLORATION_TEST = 0.02
EXPLORATION_STEPS = 425000
EXPLORATION_DECAY = (EXPLORATION_MAX - EXPLORATION_MIN) / EXPLORATION_STEPS
wandb.config.batch_size = 32
wandb.config.learning_rate = 0.00025
input_shape = (4, 84, 84)
```
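For clarity, this is how I understand the linear epsilon schedule implied by those values (a minimal sketch of my own, assuming epsilon is decremented once per training step and clamped at EXPLORATION_MIN, which is what my run does):

```python
# Minimal sketch of the linear epsilon decay (my assumption about the
# schedule, not code copied from the repo): decrement once per training
# step and clamp at EXPLORATION_MIN.
EXPLORATION_MAX = 1.0
EXPLORATION_MIN = 0.1
EXPLORATION_STEPS = 425000
EXPLORATION_DECAY = (EXPLORATION_MAX - EXPLORATION_MIN) / EXPLORATION_STEPS

epsilon = EXPLORATION_MAX
for step in range(1000000):
    epsilon = max(EXPLORATION_MIN, epsilon - EXPLORATION_DECAY)
# epsilon reaches 0.1 after 425000 steps and stays there for the rest of training.
```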

The CNN architecture is the same as yours. I also clipped the rewards with np.sign.
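Concretely, by np.sign I mean mapping every raw reward to -1, 0, or +1 before storing the transition (a minimal sketch of what I do, in the style of the DQN Nature paper):

```python
import numpy as np

# Clip every raw Atari reward to {-1.0, 0.0, +1.0} before it enters replay memory.
def clip_reward(reward):
    return np.sign(reward)

# e.g. clip_reward(200.0) -> 1.0, clip_reward(0.0) -> 0.0, clip_reward(-5.0) -> -1.0
```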

Can you guide me on what might possibly be going wrong?

(attached screenshot: "Capture")