danijar / dreamerv3

Mastering Diverse Domains through World Models
https://danijar.com/dreamerv3
MIT License

Preventing overfitting in reconstruction and dynamics loss #150

Closed truncs closed 3 months ago

truncs commented 3 months ago

When training over a long duration, even though the training loss keeps decreasing, the open-loop loss and report loss continuously increase after bottoming out (see the attached graph).

[figure: train_report_camera_loss]

A similar pattern was observed in the dynamics loss across all experiments.

[figure: dyn_loss]

Is it correct to interpret this as the model overfitting to the existing data in the replay buffer while underfitting new data, and hence not being general enough? I do see that the reward score is still improving (see screenshot below), but I assume that might be because the SAC policy is improving (see the actor loss screenshot below).
![actor_loss](https://github.com/user-attachments/assets/908e304b-cb19-4dc2-81dd-20b98dc2d6cc)
[figure: score]

A similar divergence between the train and open-loop losses was observed in the critic loss as well.

[figure: critic_loss]

My questions are

  1. Is this interpretation of overfitting correct?
  2. Does this lead to lower sample efficiency?
  3. Any strategies to mitigate it?

Would love to know your thoughts on this.
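For what it's worth, one generic way to quantify the divergence described above is to track the gap between the smoothed open-loop and training loss curves and check whether it keeps widening after the open-loop loss bottoms out. A minimal sketch (the window size and the quarter-of-curve comparison are arbitrary choices for illustration, not anything from the DreamerV3 code):

```python
import numpy as np

def overfit_gap(train_loss, openloop_loss, window=100):
    """Smooth both curves and return the open-loop minus train loss gap.

    A gap that keeps growing after the open-loop loss bottoms out is the
    divergence pattern described in this issue.
    """
    kernel = np.ones(window) / window
    train = np.convolve(train_loss, kernel, mode="valid")
    openloop = np.convolve(openloop_loss, kernel, mode="valid")
    return openloop - train

def diverging(train_loss, openloop_loss, window=100):
    """Heuristic: is the gap larger at the end than just after its minimum?"""
    gap = overfit_gap(train_loss, openloop_loss, window)
    quarter = len(gap) // 4
    start = int(np.argmin(gap))
    # Mean gap over the last quarter vs. the quarter right after the minimum.
    return gap[-quarter:].mean() > gap[start:start + quarter].mean()
```

On synthetic curves where the train loss falls while the open-loop loss bottoms out and then climbs, `diverging` returns `True`; when both curves track each other with a constant offset, it returns `False`.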

danijar commented 3 months ago

You can choose a run script with eval metrics to diagnose overfitting. Losses going up is normal when the agent explores the environment further and sees more diverse data over time.
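A hypothetical invocation along those lines, launching a run script that interleaves training with evaluation episodes; the exact script and flag names here are assumptions that may differ across versions of the repo, so check `dreamerv3/main.py` and the configs for the actual options:

```shell
# Hypothetical example; verify script/flag names against your repo version.
python dreamerv3/main.py \
  --logdir ~/logdir/dreamerv3-eval-run \
  --configs crafter \
  --run.script train_eval
```

With eval metrics logged alongside the training losses, a growing train/eval gap distinguishes genuine overfitting from losses that rise simply because exploration brings in more diverse data.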