vincekurtz / rddp

Reward-Driven Diffusion Policy
CNN-based score model #9

Open vincekurtz opened 3 months ago

vincekurtz commented 3 months ago

The current score model is a super simple fully connected MLP. We should implement a CNN with temporal convolutions and FiLM conditioning, as recommended by Cheng et al.
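
For discussion, here is a rough sketch of what one FiLM-conditioned temporal convolution block could look like. It is written in plain Python with nested lists purely to show the arithmetic and is not the project's code. All names (`conv1d_same`, `film_conv_block`, the `params` layout) are hypothetical, and the ReLU activation and the absence of normalization are simplifying assumptions, not something taken from the referenced paper.

```python
import random


def conv1d_same(x, w, b):
    """Convolve over the time axis with zero padding ("same" length).

    x: [T][C_in], w: [K][C_in][C_out] (odd K assumed), b: [C_out] -> [T][C_out]
    """
    T, K = len(x), len(w)
    c_in, c_out = len(w[0]), len(b)
    pad = K // 2
    y = []
    for t in range(T):
        row = list(b)  # start each output step from the bias
        for k in range(K):
            s = t + k - pad  # input time step seen by kernel tap k
            if 0 <= s < T:   # taps outside the sequence contribute zero
                for i in range(c_in):
                    for o in range(c_out):
                        row[o] += x[s][i] * w[k][i][o]
        y.append(row)
    return y


def linear(v, w, b):
    """Dense layer. v: [D], w: [D][O], b: [O] -> [O]"""
    return [b[o] + sum(v[d] * w[d][o] for d in range(len(v)))
            for o in range(len(b))]


def film_conv_block(x, cond, params):
    """Temporal conv followed by FiLM: a per-channel scale and shift
    predicted from the conditioning vector and shared across time steps."""
    h = conv1d_same(x, params["conv_w"], params["conv_b"])
    gamma = linear(cond, params["gamma_w"], params["gamma_b"])  # scale
    beta = linear(cond, params["beta_w"], params["beta_b"])     # shift
    # ReLU is an assumption made here for simplicity.
    return [[max(0.0, gamma[o] * h_t[o] + beta[o]) for o in range(len(h_t))]
            for h_t in h]


def init_params(rng, c_in, c_out, cond_dim, kernel=3, scale=0.1):
    """Small random weights; FiLM biases start at the identity (gamma=1, beta=0)."""
    def mat(rows, cols):
        return [[rng.uniform(-scale, scale) for _ in range(cols)]
                for _ in range(rows)]
    return {
        "conv_w": [mat(c_in, c_out) for _ in range(kernel)],
        "conv_b": [0.0] * c_out,
        "gamma_w": mat(cond_dim, c_out),
        "gamma_b": [1.0] * c_out,
        "beta_w": mat(cond_dim, c_out),
        "beta_b": [0.0] * c_out,
    }


if __name__ == "__main__":
    rng = random.Random(0)
    params = init_params(rng, c_in=2, c_out=3, cond_dim=4)
    actions = [[0.1 * t, -0.2 * t] for t in range(5)]  # [T=5][C_in=2]
    cond = [1.0, 0.0, -1.0, 0.5]  # e.g. observation and noise-level features
    out = film_conv_block(actions, cond, params)
    print(len(out), len(out[0]))
```

In a real score model the conditioning vector would presumably carry the observation and the noise level, several such blocks would be stacked, and the final layer would map back to the action dimension so the output matches the shape of the noisy action sequence.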