Closed — martintomov closed this 5 days ago
Hey, those frames were generated without a trained agent, just by selecting random actions. Looking at the samples, you still see a variety of "events" happening (losing health, shooting a target, etc.), so it's a good base for doing some first passes on the diffusion model. Eventually it'll have to be trained, but for some "simpler" ViZDoom scenarios, I think the random-action route is not a bad place to start.
Re. your second point, I think it's probably better to wait for the agent to be trained before doing the actual data collection. Using a random policy in the meantime lets you parallelize the two, IMO.
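A minimal sketch of the random-action collection loop described above, assuming a Gym-style `reset`/`step` interface. The `DummyEnv` stand-in is hypothetical and only there to make the sketch self-contained; in practice you'd plug in a ViZDoom environment wrapper:

```python
import random


class DummyEnv:
    """Hypothetical stand-in for a ViZDoom Gym-style wrapper."""

    def __init__(self, n_actions=3, episode_len=5):
        self.n_actions = n_actions
        self.episode_len = episode_len
        self.t = 0

    def reset(self):
        self.t = 0
        return {"frame": self.t}  # a real wrapper would return a screen buffer

    def step(self, action):
        self.t += 1
        obs = {"frame": self.t}
        done = self.t >= self.episode_len
        return obs, 0.0, done, {}


def collect_random_rollouts(env, n_episodes):
    """Collect (observation, action) pairs under a uniform-random policy."""
    data = []
    for _ in range(n_episodes):
        obs = env.reset()
        done = False
        while not done:
            action = random.randrange(env.n_actions)
            data.append((obs, action))
            obs, _, done, _ = env.step(action)
    return data


pairs = collect_random_rollouts(DummyEnv(), n_episodes=2)
print(len(pairs))  # 2 episodes x 5 steps each
```

Since no policy network is involved, this loop can run while the RL agent trains in parallel, which is the point being made above.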
Hi,

I'm currently training a `stable_baselines3` agent on ViZDoom and came across your `gamengen_test_dataset` on Hugging Face. I have a couple of questions regarding your setup:

1. **Training Timesteps:** How many timesteps did you use to achieve your current results? So far, I've experimented with 1,000,000 and 3,000,000 timesteps.
2. **Data Collection for Diffusion Model:** Are you collecting data during the RL agent's training for later diffusion model training, or do you only start collecting after the RL agent is fully trained, to ensure the gameplay data is more "human-like"?
Looking forward to your insights.
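For context, here is roughly the shape of the training setup being asked about — PPO from `stable_baselines3` with the two timestep budgets mentioned. The helper names are mine, the environment wiring is omitted, and SB3 is imported lazily inside the closure so the sketch stays importable on its own:

```python
# Timestep budgets tried so far (from the question above).
TIMESTEP_BUDGETS = [1_000_000, 3_000_000]


def make_train_fn(total_timesteps):
    """Return a closure that trains a PPO agent for the given budget.

    stable_baselines3 is imported lazily so nothing heavy happens
    until `train(env)` is actually called with a real environment.
    """

    def train(env):
        from stable_baselines3 import PPO  # standard SB3 entry point

        model = PPO("CnnPolicy", env, verbose=1)
        model.learn(total_timesteps=total_timesteps)
        return model

    return train


# One trainer per budget; pass a ViZDoom Gym wrapper to run them.
trainers = {n: make_train_fn(n) for n in TIMESTEP_BUDGETS}
print(sorted(trainers))  # [1000000, 3000000]
```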