Open Andy-Mielk opened 1 month ago
Hi @Andy-Mielk,
thanks for your question. The player's weights are the same as the world model's weights. agent_p.data is a tensor, so every time you update the world model weights, the player reads the updated values: the two share the weight tensors, which point to the same object.
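To illustrate the sharing, here is a minimal sketch in plain PyTorch. It is not the repository's code: the `main` and `player` modules and the loop variable names are hypothetical stand-ins, and it assumes the copy is done by assigning one parameter's `.data` to another's.

```python
import copy

import torch
import torch.nn as nn

# Hypothetical stand-ins for a main model component and its player copy.
main = nn.Linear(4, 2)
player = copy.deepcopy(main)  # after deepcopy the two have separate storage

# Rebind each player parameter's data to the main model's tensor.
# From here on both parameters view the same underlying memory.
for p_player, p_main in zip(player.parameters(), main.parameters()):
    p_player.data = p_main.data

optimizer = torch.optim.SGD(main.parameters(), lr=0.1)
before = player.weight.detach().clone()

# One training step on the main model only.
loss = main(torch.ones(1, 4)).sum()
loss.backward()
optimizer.step()  # updates the main parameters in place

shares_storage = player.weight.data_ptr() == main.weight.data_ptr()
player_changed = not torch.equal(before, player.weight)
player_matches_main = torch.equal(player.weight, main.weight)
print(shares_storage, player_changed, player_matches_main)
```

The player is never passed to the optimizer, yet its weights follow the main model because the optimizer updates the shared tensors in place. The sharing would break if the main parameter's tensor were replaced with a new one instead of being modified in place.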
Hello, I'm studying the DreamerV1 implementation in your repository, and I have a question about how the weights are synchronized between the player and the main models (world_model, actor). In the build_agent function, I see that the player is initialized with deep copies of the main model components:
Then, the weights are copied from the main models to the player:
If I understand correctly, the player is responsible for the environment interaction, so its weights must stay the same as those of the world model and actor as they are updated. However, this appears to be a one-time copy rather than a continuous synchronization. I'm wondering: how are the player's weights kept in sync with the main models during training?
Thank you for your time!