danijar / dreamerv3

Mastering Diverse Domains through World Models
https://danijar.com/dreamerv3
MIT License

Scaling laws in low-data regime #94

Closed by bseveren 5 months ago

bseveren commented 11 months ago

Hi, first of all, thank you for this magnificent work!

We're running experiments with the dreamerv3 repo in a low-data regime (with human feedback), and the scaling laws for low data are currently a bit confusing to me. You added these graphs showing that bigger models do better when tens of millions of frames are available (though it's unclear for the 400K-frame case):

[image: scaling curves for different model sizes]

Surprisingly, you chose small models for the low-data-regime benchmarks:

[image: benchmark table using the small model for low-data benchmarks]

Did you test e.g. Atari100K with bigger models? If so, were the bigger models worse, or just not better? If you did not test bigger models, what was the reason (e.g. something related to deep double descent)?

danijar commented 5 months ago

Hi, good observation. I'm using the bigger models on Atari100k in the updated paper. But there is a limit: when the task is really simple and the model is already really large, increasing model size further does not help. At that point, it can make sense to use a smaller model to reduce wall-clock time and thus iterate faster.

[image: Atari100k results with larger models from the updated paper]
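For readers with the same question, here is a minimal sketch of how a model-size preset could be selected when configuring a run with this repo. The preset names (`medium`, `atari100k`) and config keys below are assumptions based on older versions of the repo's `configs.yaml` and `example.py`, and may differ in the current code; check your checkout for the presets it actually defines.

```python
# Sketch (not from this thread): choosing a model-size preset for a
# low-data run. Preset names and keys are assumptions; see
# dreamerv3/configs.yaml in your checkout for the ones it defines.
import dreamerv3
from dreamerv3 import embodied

config = embodied.Config(dreamerv3.configs['defaults'])
config = config.update(dreamerv3.configs['medium'])      # swap for a larger preset if it helps
config = config.update(dreamerv3.configs['atari100k'])   # hypothetical low-data task preset
config = config.update({
    'logdir': '~/logdir/atari100k_medium',  # hypothetical run directory
})
```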