arnaudstiegler / gameNgen-repro

Apache License 2.0

Training Timesteps for Stable Baselines3 Agent and Data Collection Process for Diffusion Model #1

Closed martintomov closed 5 days ago

martintomov commented 1 month ago

Hi,

I'm currently training a stable_baselines3 agent on ViZDoom and came across your gamengen_test_dataset on Hugging Face. I have a couple of questions regarding your setup:

  1. Training Timesteps: How many timesteps did you use to achieve your current results? So far, I've experimented with 1,000,000 and 3,000,000 timesteps.

  2. Data Collection for Diffusion Model: Are you collecting data during the RL agent's training for later diffusion model training, or do you only start data collection after the RL agent is fully trained to ensure the gameplay data is more "human-like"?

Looking forward to your insights.
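For context on the timestep question: in stable_baselines3, `model.learn(total_timesteps=...)` budgets individual environment steps rather than episodes, so 1,000,000 timesteps spans many episodes. Below is a minimal pure-Python sketch of that budgeting logic, using a hypothetical `DummyEnv` stand-in (a real run would use stable_baselines3 with a ViZDoom gym wrapper, neither of which is shown here):

```python
import random

class DummyEnv:
    """Hypothetical stand-in for a ViZDoom gym wrapper: fixed-length episodes."""
    def __init__(self, episode_len=50):
        self.episode_len = episode_len
        self.t = 0

    def reset(self):
        self.t = 0
        return 0  # dummy observation

    def step(self, action):
        self.t += 1
        done = self.t >= self.episode_len
        return 0, 0.0, done  # obs, reward, done

def train_for(total_timesteps, env):
    """Mimics how SB3's model.learn(total_timesteps=...) counts env steps, not episodes."""
    steps = episodes = 0
    env.reset()
    while steps < total_timesteps:
        _, _, done = env.step(random.randrange(3))  # random action stands in for the policy
        steps += 1
        if done:
            episodes += 1
            env.reset()
    return steps, episodes

steps, episodes = train_for(1_000, DummyEnv(episode_len=50))
print(steps, episodes)  # 1000 steps span 20 full episodes of length 50
```

So a 1M-timestep run is 1M `env.step` calls, regardless of how episodes are cut up.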

arnaudstiegler commented 1 month ago

Hey, those frames were generated without a trained agent, just by selecting random actions. Looking at the samples, you still see a variety of "events" happening (losing health, shooting a target, etc.), so it's a good base for doing some first passes on the diffusion model. Eventually the agent will have to be trained, but for some "simpler" ViZDoom scenarios, I think the random-action route is not a bad place to start.
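A random-action collection loop of this kind could be sketched as follows. The `StubGame` class and `ACTIONS` list are hypothetical placeholders standing in for `vizdoom.DoomGame` and a real button configuration; only the loop structure (log the current frame and the random action taken from it, restarting episodes as needed) reflects the approach described above:

```python
import random

class StubGame:
    """Hypothetical stand-in for vizdoom.DoomGame with fixed-length episodes."""
    def __init__(self, episode_len=100):
        self.episode_len = episode_len
        self.t = 0

    def new_episode(self):
        self.t = 0

    def get_frame(self):
        return f"frame_{self.t}"  # placeholder for a real screen buffer

    def make_action(self, action):
        self.t += 1

    def is_episode_finished(self):
        return self.t >= self.episode_len

ACTIONS = ["attack", "move_left", "move_right"]  # hypothetical action set

def collect_random_rollout(game, n_steps):
    """Log (frame, action) pairs from a random policy for later diffusion-model training."""
    pairs = []
    game.new_episode()
    for _ in range(n_steps):
        if game.is_episode_finished():
            game.new_episode()
        action = random.choice(ACTIONS)
        pairs.append((game.get_frame(), action))  # frame observed before acting
        game.make_action(action)
    return pairs

data = collect_random_rollout(StubGame(), 250)
print(len(data))  # 250 (frame, action) conditioning pairs
```

Since the policy is random, this can run fully in parallel with RL training and be swapped out for trained-agent rollouts later.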

Re. your second point, I think it's probably better to wait for the agent to be trained before doing the actual data collection. Using a random policy in the meantime lets you parallelize the two, IMO.