hjsuh94 / score_po

Score-Guided Planning

Adding U-Net for diffusion model #33

Open hongkai-dai opened 1 year ago

hongkai-dai commented 1 year ago

It seems that most diffusion papers use U-Net (or U-Net with FiLM structure for conditional input) instead of MLP for the diffusion model. We can consider adding our own U-Net.
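For reference, the FiLM part itself is tiny; something like this rough sketch (class and argument names here are purely illustrative, not taken from any existing implementation):

```python
import torch
import torch.nn as nn

class FiLMLayer(nn.Module):
    """Feature-wise Linear Modulation: scale and shift hidden features
    using a vector computed from the conditioning input."""

    def __init__(self, feature_dim: int, cond_dim: int):
        super().__init__()
        self.to_scale_shift = nn.Linear(cond_dim, 2 * feature_dim)

    def forward(self, h: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
        # Split the projection into a per-feature scale and shift.
        scale, shift = self.to_scale_shift(cond).chunk(2, dim=-1)
        return h * (1.0 + scale) + shift
```

The conditioning vector would typically be an embedding of the noise level and/or the conditioning state, inserted between layers of the backbone.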

hjsuh94 commented 1 year ago

My impression is that U-Net will only be useful when we deal with images. For vector data, I'm not sure how much inductive bias it will provide.

But we should definitely add it for pixel-domain examples!

hongkai-dai commented 1 year ago

Sounds good! I was checking Janner's code and saw that they use a U-Net for their state/action pairs: https://github.com/jannerm/diffuser/blob/main/diffuser/models/temporal.py.

Do you mind if I add a preliminary U-Net implementation as an exercise? I am having trouble fitting a good score function with my MLP on the cart-pole example, so I am trying to debug what is happening. One candidate fix is to switch to a different network structure; see the rough sketch below.
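As a note to myself, the basic building block in that temporal U-Net is roughly a 1D-conv residual block over the horizon dimension, with channels holding the state/action dimensions. A hedged sketch of the idea (names and details are illustrative, not copied from Janner's code or from this repo):

```python
import torch
import torch.nn as nn

class TemporalResBlock(nn.Module):
    """1D-conv residual block over the horizon dimension.

    Input shape: (batch, state+action dims, horizon). A rough sketch of the
    idea only; the real temporal U-Net also uses normalization and a time
    (noise-level) embedding.
    """

    def __init__(self, in_ch: int, out_ch: int, kernel_size: int = 5):
        super().__init__()
        pad = kernel_size // 2
        self.conv1 = nn.Conv1d(in_ch, out_ch, kernel_size, padding=pad)
        self.conv2 = nn.Conv1d(out_ch, out_ch, kernel_size, padding=pad)
        self.act = nn.Mish()
        self.skip = nn.Conv1d(in_ch, out_ch, 1) if in_ch != out_ch else nn.Identity()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.act(self.conv1(x))
        h = self.act(self.conv2(h))
        return h + self.skip(x)
```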

hjsuh94 commented 1 year ago

That sounds good to me! I think data stabilization is a good test to see if the score function was trained correctly.

I have also noticed that the score function is a bit fickle to train compared to standard regression.

hongkai-dai commented 1 year ago

Sorry, what do you mean by data stabilization? Currently I test the learned score function by running Langevin dynamics, zₜ₊₁ = zₜ + (ε/2)·s_θ(zₜ) + √ε·noise, and checking whether, after many Langevin steps (like 1000), z looks like it comes from the training data distribution. Is that what you mean?
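For concreteness, the test I run is roughly the following sketch (function and argument names are mine, assuming a PyTorch score model s_θ):

```python
import math
import torch

@torch.no_grad()
def langevin_sample(score_fn, z0, eps=1e-3, n_steps=1000, add_noise=True):
    """Run z_{t+1} = z_t + (eps/2) * s_theta(z_t) + sqrt(eps) * noise for n_steps."""
    z = z0.clone()
    for _ in range(n_steps):
        z = z + 0.5 * eps * score_fn(z)
        if add_noise:
            z = z + math.sqrt(eps) * torch.randn_like(z)
    return z
```

If s_θ is trained well, the samples after ~1000 steps should look like draws from the training distribution.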

hjsuh94 commented 1 year ago

That's exactly right, although I've been simply using standard gradient descent!

hongkai-dai commented 1 year ago

Got it, thanks! I will try the version zₜ₊₁ = zₜ + (ε/2)·s_θ(zₜ) without the noise term; I think that corresponds to standard gradient descent?
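Concretely, that would just be dropping the noise term in the hypothetical `langevin_sample` sketch above:

```python
# Follow the score only (no noise): gradient descent on -log p_theta(z).
z_final = langevin_sample(score_fn, z0, eps=1e-3, n_steps=1000, add_noise=False)
```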