Improve the training procedure

vincekurtz / rddp

Reward-Driven Diffusion Policy


Open vincekurtz opened 3 months ago

vincekurtz commented 3 months ago

The current training loop is fairly efficient, but lacks some features. Eventually we will want:

[ ] A proper validation set (should be generated independently, since data points are correlated)
[ ] Logging (to tensorboard or similar)
[ ] Input normalization
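
As a starting point for the first and third items, here is a minimal sketch in plain Python. It is not code from this repository: `generate_dataset`, `fit_normalizer`, and `normalize` are hypothetical names, and the Gaussian samples are a stand-in for the real data generator. The idea is that the validation set comes from a separate generation run with its own seed rather than from a slice of the training data, and that the normalization statistics are fit on the training set only.

```python
import random
import statistics


def generate_dataset(seed, num_points=256, dim=2):
    """Hypothetical stand-in for the real data generator."""
    rng = random.Random(seed)
    return [[rng.gauss(5.0, 3.0) for _ in range(dim)] for _ in range(num_points)]


def fit_normalizer(data, eps=1e-8):
    """Compute per-feature mean and standard deviation of a dataset."""
    columns = list(zip(*data))
    means = [statistics.fmean(c) for c in columns]
    # eps guards against division by zero for constant features.
    stds = [statistics.pstdev(c) + eps for c in columns]
    return means, stds


def normalize(data, means, stds):
    """Shift and scale each feature using precomputed statistics."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in data]


# Independent generation runs, so validation points are not
# correlated with training points through a shared run.
train = generate_dataset(seed=0)
val = generate_dataset(seed=1)

# Fit statistics on the training set only, then reuse them for validation.
means, stds = fit_normalizer(train)
train_n = normalize(train, means, stds)
val_n = normalize(val, means, stds)
```

The same `means` and `stds` would also need to be stored with the trained model, so that inputs seen at inference time are scaled the same way.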