IntelLabs / coach

Reinforcement Learning Coach by Intel AI Lab enables easy experimentation with state-of-the-art Reinforcement Learning algorithms.
https://intellabs.github.io/coach/
Apache License 2.0

Implementation detail in QRDQN #397

Open Officium opened 4 years ago

Officium commented 4 years ago

In theory, the estimated quantile values must be non-decreasing in the quantile fraction \tau. In practice, this ordering should be enforced by the loss function rather than by sorting, because the quantile regression for a particular transition uses the same collection of targets for every output and only the quantile parameter \tau differs.

In the code, we should remove the sort operation at https://github.com/NervanaSystems/coach/blob/fc5039854416064b5ef7938b707495d347776885/rl_coach/agents/qr_dqn_agent.py#L121
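
To illustrate the point, here is a minimal sketch of a per-quantile Huber loss, assuming the usual QR-DQN setup with fixed midpoint fractions \tau_i = (i + 0.5) / N. The function names are hypothetical and are not taken from the Coach code base. Every quantile output is regressed against the same targets; only the weight derived from \tau changes.

```python
def quantile_midpoints(n):
    """Fixed quantile fractions tau_i = (i + 0.5) / n (assumed QR-DQN convention)."""
    return [(i + 0.5) / n for i in range(n)]


def huber(u, kappa=1.0):
    """Huber loss: quadratic for |u| <= kappa, linear beyond it."""
    a = abs(u)
    return 0.5 * u * u if a <= kappa else kappa * (a - 0.5 * kappa)


def per_quantile_loss(predicted, targets, kappa=1.0):
    """Quantile Huber loss of each predicted quantile against the same targets.

    predicted: list of N quantile estimates (in output order, NOT sorted)
    targets:   list of target samples shared by all N quantiles
    Returns a list of N losses, one per quantile output.
    """
    taus = quantile_midpoints(len(predicted))
    losses = []
    for tau, theta in zip(taus, predicted):
        total = 0.0
        for t in targets:
            u = t - theta  # positive when the estimate is too low
            # Asymmetric weight: |tau - 1{u < 0}|
            weight = abs(tau - (1.0 if u < 0 else 0.0))
            total += weight * huber(u, kappa) / kappa
        losses.append(total / len(targets))
    return losses


# Two outputs at the same value, one shared target above them:
low, high = per_quantile_loss([0.0, 0.0], [0.5])
print(low, high)
```

In this sketch the output with the larger \tau is penalized more for underestimating the same target, so the asymmetric weight is what pushes higher-\tau outputs toward larger values. No sort is applied to the predictions at any point.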