danijar / dreamerv3

Mastering Diverse Domains through World Models
https://danijar.com/dreamerv3
MIT License

Scaling laws in low-data regime #94

Closed by bseveren 5 months ago

bseveren commented 11 months ago

Hi, first of all, thank you for this magnificent work!

We're running experiments with the dreamerv3 repo in a low-data regime (with human feedback), and the scaling laws for low data are currently a bit confusing to me. You added these graphs showing that bigger models do better when tens of millions of frames are available (though it's unclear for the 400K-frame case):

[image: scaling curves for different model sizes]

Surprisingly, you chose small models for the low-data-regime benchmarks:

[image: benchmark table using the small model for low-data benchmarks]

Did you test e.g. Atari100K with bigger models? If so, were the bigger models worse, or just not better? If you did not test bigger models, what was the reason (e.g. something related to deep double descent)?

danijar commented 5 months ago

Hi, good observation. I'm using the bigger models on Atari100k in the updated paper. But there is a limit: when the task is really simple and the model is already really large, increasing model size further does not help. At that point, it can make sense to use a smaller model to reduce wall-clock time and thus iterate faster.

[image: Atari100k results with larger models from the updated paper]
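For readers with the same question, here is a minimal sketch of how a model-size preset could be selected when configuring a run with this repo. The preset names (`medium`, `atari100k`) and config keys below are assumptions based on older versions of the repo's `configs.yaml` and `example.py`, and may differ in the current code; check your checkout for the presets it actually defines.

```python
# Sketch (not from this thread): choosing a model-size preset for a
# low-data run. Preset names and keys are assumptions; see
# dreamerv3/configs.yaml in your checkout for the ones it defines.
import dreamerv3
from dreamerv3 import embodied

config = embodied.Config(dreamerv3.configs['defaults'])
config = config.update(dreamerv3.configs['medium'])      # swap for a larger preset if it helps
config = config.update(dreamerv3.configs['atari100k'])   # hypothetical low-data task preset
config = config.update({
    'logdir': '~/logdir/atari100k_medium',  # hypothetical run directory
})
```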