seungjaeryanlee / agents

TF-Agents is a library for Reinforcement Learning in TensorFlow
Apache License 2.0

Implement the basic elements of RND #3

Closed seungjaeryanlee closed 5 years ago

seungjaeryanlee commented 5 years ago

Implemented

  1. The RND target and predictor networks are defined
  2. The RND loss (the predictor's squared error against the target) is computed and used as the intrinsic reward
  3. The RND predictor network is trained on the RND loss averaged over the experience batch
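
For reference, the three implemented pieces can be sketched in plain NumPy. This is only an illustration, not the code in this fork: the single linear layers, the sizes `OBS_DIM` and `FEAT_DIM`, the learning rate, and the helper names `rnd_errors` and `train_step` are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

OBS_DIM, FEAT_DIM = 8, 4  # hypothetical sizes, chosen for the example

# Fixed, randomly initialised target network (never trained).
W_target = rng.normal(size=(OBS_DIM, FEAT_DIM))
# Trainable predictor network, initialised independently of the target.
W_pred = rng.normal(size=(OBS_DIM, FEAT_DIM))


def rnd_errors(obs, w_pred):
    """Per-observation squared prediction error, shape (batch,)."""
    diff = obs @ w_pred - obs @ W_target
    return np.sum(diff ** 2, axis=1)


def train_step(obs, w_pred, lr=0.01):
    """One gradient-descent step on the batch-averaged RND loss."""
    diff = obs @ w_pred - obs @ W_target       # (batch, FEAT_DIM)
    grad = 2.0 * obs.T @ diff / obs.shape[0]   # d(mean loss) / d(w_pred)
    return w_pred - lr * grad


obs_batch = rng.normal(size=(32, OBS_DIM))

# The per-sample error doubles as the intrinsic reward for each transition.
intrinsic_reward = rnd_errors(obs_batch, W_pred)
loss_before = intrinsic_reward.mean()

# Training the predictor shrinks the error on observations it has seen.
for _ in range(200):
    W_pred = train_step(obs_batch, W_pred)

loss_after = rnd_errors(obs_batch, W_pred).mean()
```

Because the predictor only ever sees visited observations, the reward for familiar states decays over training while novel states keep a large error.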

To Be Implemented (section numbers refer to the RND paper)

  1. RND should be usable with every agent type (currently only paired with DQN)
  2. An environment wrapper that makes the environment non-episodic (Section 2.3)
  3. Q-Network with dual value head for intrinsic/extrinsic rewards (Section 2.3)
  4. Intrinsic Reward Normalization (Section 2.4)
  5. Observation Normalization (Section 2.4)
  6. Separate discount factors for intrinsic and extrinsic rewards (Section 3.3)
  7. CNN vs RNN policy (Section 3.5)
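
Items 4 and 5 could share one running-statistics helper. The sketch below assumes the scheme described in the RND paper: observations are whitened with a running mean and standard deviation and clipped to [-5, 5], and intrinsic rewards are divided by a running estimate of the standard deviation of the intrinsic returns. The class and function names are hypothetical and not part of TF-Agents.

```python
import numpy as np


class RunningMeanStd:
    """Tracks a running mean and variance using batched updates."""

    def __init__(self, shape=()):
        self.mean = np.zeros(shape)
        self.var = np.ones(shape)
        self.count = 1e-4  # small prior count avoids division by zero

    def update(self, batch):
        batch = np.asarray(batch, dtype=np.float64)
        b_mean = batch.mean(axis=0)
        b_var = batch.var(axis=0)
        b_count = batch.shape[0]
        delta = b_mean - self.mean
        total = self.count + b_count
        # Combine the two sets of second moments (parallel variance formula).
        m2 = (self.var * self.count + b_var * b_count
              + delta ** 2 * self.count * b_count / total)
        self.mean = self.mean + delta * b_count / total
        self.var = m2 / total
        self.count = total


def normalize_observation(obs, stats, clip=5.0, eps=1e-8):
    """Whiten observations with running statistics, then clip."""
    return np.clip((obs - stats.mean) / np.sqrt(stats.var + eps), -clip, clip)


def normalize_intrinsic_reward(reward, return_stats, eps=1e-8):
    """Scale intrinsic rewards by the running std of intrinsic returns."""
    return reward / np.sqrt(return_stats.var + eps)


# Usage with made-up data: observations with mean 3 and std 2.
rng = np.random.default_rng(0)
obs = rng.normal(3.0, 2.0, size=(1000, 3))
obs_stats = RunningMeanStd(shape=(3,))
obs_stats.update(obs)
normalized_obs = normalize_observation(obs, obs_stats)

# Hypothetical batch of intrinsic returns used to scale a batch of rewards.
return_stats = RunningMeanStd()
return_stats.update(rng.normal(0.0, 10.0, size=1000))
scaled_reward = normalize_intrinsic_reward(np.array([10.0, 20.0]), return_stats)
```

The intrinsic reward is only rescaled, not mean-centred, so it stays non-negative.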