danijar / dreamerv3

Mastering Diverse Domains through World Models
https://danijar.com/dreamerv3
MIT License
1.12k stars 194 forks

Fully deterministic runs #43

Open jadkins99 opened 1 year ago

jadkins99 commented 1 year ago

Awesome repo. Quick question:

I ran the DMC WalkerWalk experiment 3 different times with the same seeds and got 3 different learning curves. How can I get reproducible experiments?

danijar commented 1 year ago

Hi, are you asking for fully deterministic runs? I haven't paid much attention to this, but I think the agent is already fully deterministic, so you'd probably just have to set the environment seed. If you use more than 1 environment instance, make sure the environments have different seeds so they produce different data.
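For illustration, here is a minimal sketch of the seeding idea above: one base seed per run, and a distinct derived seed for each environment instance. The names `instance_seeds`, `ToyEnv`, and `rollout` are hypothetical and not part of the repo; `ToyEnv` is only a stand-in that draws random numbers, so this shows the pattern rather than how dreamerv3 actually constructs its environments.

```python
import random


def instance_seeds(base_seed: int, num_envs: int) -> list[int]:
    """Derive one distinct seed per environment instance (hypothetical helper)."""
    # Distinct across instances, and across base seeds for a fixed num_envs.
    return [base_seed * num_envs + index for index in range(num_envs)]


class ToyEnv:
    """Stand-in for a real environment; it only draws random numbers."""

    def __init__(self, seed: int):
        # A private generator avoids sharing global random state.
        self._rng = random.Random(seed)

    def step(self) -> float:
        return self._rng.random()


def rollout(base_seed: int, num_envs: int, steps: int) -> list[list[float]]:
    """Step every instance `steps` times and return one trajectory per instance."""
    envs = [ToyEnv(seed) for seed in instance_seeds(base_seed, num_envs)]
    return [[env.step() for _ in range(steps)] for env in envs]


first = rollout(base_seed=0, num_envs=4, steps=5)
second = rollout(base_seed=0, num_envs=4, steps=5)
print("runs identical:", first == second)
print("instances differ:", len({tuple(t) for t in first}) == len(first))
```

Repeating the rollout with the same base seed reproduces the same data, while the instances within a run still see different random streams.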

jadkins99 commented 1 year ago

Okay I will try that. Thank you for the quick response! What exactly is an "environment instance"? I couldn't find a clear definition in the paper.

jadkins99 commented 1 year ago

Also, how many seeds were the non-Minecraft experiments run for?

subho406 commented 1 year ago

+1 on the question above. Maybe it's not that apparent in the paper; could you also provide some clarification on what the confidence intervals denote in the non-Minecraft experiments (DMLab, DMC Proprio, Crafter, etc.)? Is it std-error across multiple seeds, or std-error across a window of timesteps with a single seed, or something else?

danijar commented 1 year ago

It's mean/std across seeds, with at least 3 seeds per task, often more.
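As a rough sketch of what "mean/std across seeds" means for a learning curve: at each logged step, take the mean and standard deviation over the per-seed scores. The scores below are made-up placeholder numbers, and the function name `mean_std_across_seeds` is hypothetical. The thread does not say whether the sample or population standard deviation is used; this sketch assumes the sample version (`statistics.stdev`).

```python
import statistics


def mean_std_across_seeds(curves: dict[int, list[float]]) -> list[tuple[float, float]]:
    """Return (mean, std) per logged step, aggregated over seeds."""
    # zip(*...) regroups the data from per-seed curves into per-step columns.
    per_step = zip(*curves.values())
    return [
        (statistics.mean(scores), statistics.stdev(scores))  # sample std (assumption)
        for scores in per_step
    ]


# Placeholder scores for three seeds at three logged steps (not real results).
curves = {
    0: [1.0, 2.0, 3.0],
    1: [2.0, 4.0, 6.0],
    2: [3.0, 6.0, 9.0],
}

for step, (mean, std) in enumerate(mean_std_across_seeds(curves)):
    print(f"step {step}: mean={mean:.2f} std={std:.2f}")
```

This differs from computing a std over a window of timesteps within a single seed, which is the alternative raised in the question above.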

jadkins99 commented 1 year ago

Update: I seeded dmc_control here and still got non-deterministic runs. Are there other non-environment sources of randomness that are not seeded?

jadkins99 commented 1 year ago

I found some non-seeded randomness in the repo. Namely here and here. Wouldn't these affect the agent?

danijar commented 1 year ago

I don't think those two methods are ever run. Could you check, e.g. by adding asdf to the two methods to see if it errors?

swannercjj commented 10 months ago

Seeding this removes randomness from the first 1000 steps, but runs are non-deterministic afterwards.
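One way to pin down where two runs start to differ is to log a scalar per step in each run and find the first step at which the logs disagree. This is a minimal sketch; `first_divergence` is a hypothetical helper, and the two logs are placeholder values rather than output from the repo. The comparison is exact on purpose, since a fully deterministic run should reproduce values bit for bit.

```python
from itertools import zip_longest

_MISSING = object()  # marks steps present in only one of the two logs


def first_divergence(run_a, run_b):
    """Return the first step index where the two logs differ, or None if identical."""
    pairs = zip_longest(run_a, run_b, fillvalue=_MISSING)
    for step, (a, b) in enumerate(pairs):
        if a is _MISSING or b is _MISSING or a != b:
            return step
    return None


# Placeholder logs: identical for the first three steps, different afterwards.
log_a = [0.10, 0.25, 0.40, 0.55, 0.70]
log_b = [0.10, 0.25, 0.40, 0.56, 0.69]
print("first differing step:", first_divergence(log_a, log_b))
```

Knowing the first differing step narrows the search to whatever code starts running at that point in training.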