Open gunnxx opened 11 months ago

Hi, thanks for the great paper and nice code!

I am wondering whether you have encountered the case where the distractor variable $s_t^-$ reconstructs the agent and the task variable $s_t^+$ reconstructs the background? That solution is suboptimal under the given objective, but apparently the network can't escape this local optimum.

Hi,

In principle, because there is a loss for predicting reward from the task state, minimizing the total loss should push the model toward the correct attribution. I suggest trying a higher loss coefficient on the reward term to further encourage the task model to capture reward-correlated information.
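To make the suggestion concrete, here is a minimal sketch of how the reward coefficient trades off against reconstruction in a two-branch objective like this one. The function, the module names (`decoders`, `reward_head`), and the coefficient names `recon_coef` / `reward_coef` are hypothetical and do not mirror this repository's actual code.

```python
import torch
import torch.nn.functional as F

def total_loss(obs, reward, s_plus, s_minus, decoders, reward_head,
               recon_coef=1.0, reward_coef=1.0):
    # Reconstruct the observation from both branches (task + distractor).
    recon = decoders["task"](s_plus) + decoders["distractor"](s_minus)
    recon_loss = F.mse_loss(recon, obs)

    # Predict reward from the task branch only, so reward-correlated
    # information is pushed into s_plus rather than s_minus.
    reward_pred = reward_head(s_plus)
    reward_loss = F.mse_loss(reward_pred, reward)

    # Raising reward_coef strengthens the gradient signal that penalizes
    # placing reward-relevant factors (e.g. the agent) in s_minus,
    # which can help escape the swapped-attribution local optimum.
    return recon_coef * recon_loss + reward_coef * reward_loss
```

In this sketch, if the agent ends up in `s_minus`, the reward head sees only background features and `reward_loss` stays high; a larger `reward_coef` makes that penalty dominate the reconstruction term, so the optimizer is pushed to move agent information back into `s_plus`.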