Implement the basic elements of RND

seungjaeryanlee / agents

TF-Agents is a library for Reinforcement Learning in TensorFlow

Apache License 2.0


Implement the basic elements of RND #3

Closed seungjaeryanlee closed 5 years ago

seungjaeryanlee commented 5 years ago

Implemented

The RND target and predictor networks are defined
The RND loss is computed and used as the intrinsic reward
The RND predictor network is trained on the RND loss averaged over the experience batch
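
The three implemented pieces can be illustrated with a minimal sketch. This is not the code from this fork and it does not use TF-Agents: it assumes plain linear networks written in pure Python, and the names `make_linear`, `forward`, `rnd_loss`, and `train_predictor` are hypothetical. It only shows a fixed random target, a trainable predictor, the per-observation squared error used as the intrinsic reward, and a gradient step on the batch-averaged loss.

```python
import random

def make_linear(in_dim, out_dim, rng):
    """Random weight matrix (out_dim x in_dim); stands in for a network (assumption)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]

def forward(weights, obs):
    """Apply the linear map to one observation vector."""
    return [sum(w * x for w, x in zip(row, obs)) for row in weights]

def rnd_loss(target_w, predictor_w, obs):
    """Squared error between predictor and target features for one observation."""
    t = forward(target_w, obs)
    p = forward(predictor_w, obs)
    return sum((pi - ti) ** 2 for pi, ti in zip(p, t))

def train_predictor(target_w, predictor_w, batch, lr=0.05):
    """One gradient step on the batch-averaged RND loss; the target stays fixed.

    Returns the mean loss measured before the update.
    """
    mean_loss = sum(rnd_loss(target_w, predictor_w, o) for o in batch) / len(batch)
    grads = [[0.0] * len(row) for row in predictor_w]
    for obs in batch:
        t = forward(target_w, obs)
        p = forward(predictor_w, obs)
        for i, (pi, ti) in enumerate(zip(p, t)):
            for j, x in enumerate(obs):
                grads[i][j] += 2.0 * (pi - ti) * x / len(batch)
    for i, row in enumerate(predictor_w):
        for j in range(len(row)):
            row[j] -= lr * grads[i][j]
    return mean_loss

rng = random.Random(0)
target_w = make_linear(4, 3, rng)       # fixed, randomly initialized target
predictor_w = make_linear(4, 3, rng)    # trainable predictor
target_before = [row[:] for row in target_w]
batch = [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(8)]

# Intrinsic reward for each observation is its RND loss.
intrinsic_rewards = [rnd_loss(target_w, predictor_w, o) for o in batch]

# Repeated training on the same batch drives the loss (and thus the reward) down.
losses = [train_predictor(target_w, predictor_w, batch) for _ in range(200)]
```

Observations the predictor has already been trained on yield a small loss and therefore a small intrinsic reward, while unfamiliar observations keep a larger one.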

To Be Implemented

RND should be usable with every agent type (currently it is only paired with DQN)
An environment wrapper that makes the environment non-episodic (Section 2.3)
Q-network with dual value heads for intrinsic and extrinsic rewards (Section 2.3)
Intrinsic reward normalization (Section 2.4)
Observation normalization (Section 2.4)
Separate discount factors for intrinsic and extrinsic rewards (Section 3.3)
CNN vs RNN policy (Section 3.5)
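
Several of the remaining items (observation normalization, intrinsic reward normalization, non-episodic intrinsic returns, and separate discount factors) can be sketched in a few lines. This is an illustration under stated assumptions, not a design for this fork: observations are scalars for brevity, the clip range of 5 and the discount values are illustrative, and `RunningStats`, `normalize_observation`, `normalize_intrinsic_reward`, and `discounted_returns` are hypothetical names.

```python
import math

class RunningStats:
    """Running mean and standard deviation of a scalar stream (Welford's algorithm)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the running mean

    def update(self, x):
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    @property
    def std(self):
        # Fall back to 1.0 until the estimate is usable, to avoid dividing by zero.
        if self.count < 2 or self.m2 == 0.0:
            return 1.0
        return math.sqrt(self.m2 / self.count)

def normalize_observation(x, stats, clip=5.0):
    """Whiten an observation with running statistics, then clip (clip value assumed)."""
    z = (x - stats.mean) / stats.std
    return max(-clip, min(clip, z))

def normalize_intrinsic_reward(r, return_stats):
    """Divide the intrinsic reward by a running std estimate of intrinsic returns."""
    return r / return_stats.std

def discounted_returns(rewards, gamma, dones=None):
    """Discounted returns; pass dones=None to treat the stream as non-episodic."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        if dones is not None and dones[t]:
            running = 0.0  # do not bootstrap across an episode boundary
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns

# Extrinsic stream: episodic, with its own discount factor (value is illustrative).
extrinsic = discounted_returns([1.0, 1.0, 1.0], gamma=0.5, dones=[False, True, False])
# Intrinsic stream: non-episodic, with a separate discount factor.
intrinsic = discounted_returns([1.0, 1.0, 1.0], gamma=0.5)

obs_stats = RunningStats()
for value in (1.0, 2.0, 3.0):
    obs_stats.update(value)
```

With two return streams computed this way, a dual value head would fit one head to each stream, so the intrinsic and extrinsic discount factors can be tuned independently.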