CyberAgentAILab / layout-dm

LayoutDM: Discrete Diffusion Model for Controllable Layout Generation [Inoue+, CVPR2023]
https://cyberagentailab.github.io/layout-dm
Apache License 2.0

What does q_posterior mean? #9

Closed (WuChannn closed this issue 1 year ago)

WuChannn commented 1 year ago

Hi~

There is a "q_posterior" function in layout_dm/src/trainer/trainer/models/categorical_diffusion/constrained.py. What does it mean?

Also, in the "forward" function in the same file, does "t" hold the "T" for each item in a batch, so that the shape of "t" is [batch_size]? Does the argument "t=t" mean "T" in every function in this file? For example, does log_x0_recon = self.predict_start(log_x_t_full, t=t) mean "given noisy x_T, predict x0"?

Looking forward to your reply. thx

@kurochan @kyamagu @ciela @naoto0804

naoto0804 commented 1 year ago

I appreciate your interest in our work!

q_posterior

Please refer to the paragraph around Eq.3 of our paper.
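
For readers without the paper at hand, the following is a sketch of the posterior that discrete diffusion models of this family usually compute. It follows from Bayes' rule and the Markov property of the forward process. The notation (z_t for the noisy state at step t, z_0 for the clean data) is an assumption and may not match the paper's Eq. 3 symbol for symbol:

```latex
q(z_{t-1} \mid z_t, z_0)
  = \frac{q(z_t \mid z_{t-1})\, q(z_{t-1} \mid z_0)}{q(z_t \mid z_0)}
```

A function named q_posterior presumably evaluates this quantity (typically in log space), once with the true z_0 to form the training target and once with the model's predicted z_0 to take a reverse step.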

t

't' is an arbitrary integer number between T and 0. Please refer to the first paragraph of Sec 3.1.

WuChannn commented 1 year ago

@naoto0804 thx for your reply.

If 't' is an arbitrary integer between T and 0, I don't see any code that loops from T to 0. So how do we get x_T from x_0, and how do we get x_0 from x_T?

Also, what does t, pt = self.sample_time(b, device, "importance") mean? The shape of 't' is [batch_size] and 't' is an arbitrary integer, so I think it assigns a 'T' to every item in a batch. Is that wrong?

Looking forward to your reply.

naoto0804 commented 1 year ago

we get x_T from x_0 and how we get x_0 from x_T?

what t, pt = self.sample_time(b, device, "importance") means?

You can easily see the definition here. You can see that t is sampled uniformly at random between 0 and T if method == "uniform". When method == "importance", an improved version based on importance sampling is used.
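
The two sampling modes can be sketched in plain Python as follows. This is an illustration of the general pattern found in multinomial-diffusion-style code, not the repository's actual implementation: the function signature, the loss_history and counts arguments, the threshold of 10, and the small constants are all assumptions made for this example.

```python
import math
import random


def sample_time(batch_size, num_timesteps, loss_history=None, counts=None,
                method="uniform", rng=random):
    """Return (t, pt): one timestep per batch item and the probability of drawing it."""
    if method == "importance":
        # Assumption: fall back to uniform sampling until every timestep
        # has been visited often enough for its loss estimate to be usable.
        if loss_history is None or counts is None or min(counts) <= 10:
            return sample_time(batch_size, num_timesteps, method="uniform", rng=rng)
        # Timesteps with a larger running loss are drawn more often.
        weights = [math.sqrt(loss + 1e-10) + 1e-4 for loss in loss_history]
        # Assumption: t=0 reuses the weight of t=1, as such code often does.
        weights[0] = weights[1]
        total = sum(weights)
        probs = [w / total for w in weights]
        t = rng.choices(range(num_timesteps), weights=probs, k=batch_size)
        return t, [probs[i] for i in t]
    if method == "uniform":
        t = [rng.randrange(num_timesteps) for _ in range(batch_size)]
        return t, [1.0 / num_timesteps] * batch_size
    raise ValueError(f"unknown method: {method}")


if __name__ == "__main__":
    t, pt = sample_time(4, 100, method="uniform", rng=random.Random(0))
    print(t, pt)
```

In this pattern, t is a list with one entry per batch item, and pt is the probability with which each entry was drawn. Dividing each item's loss by its pt is the usual way to keep the importance-sampled loss estimate unbiased.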

WuChannn commented 1 year ago

@naoto0804 hi~thx for your reply.

I know where to find this code, but I have the following questions:

  1. During training, there is no code that needs x_0 from x_T, so I am confused about how we get x_0 from x_T during training if 't' is an arbitrary integer between T and 0.
  2. Why is a 't' randomly sampled between 0 and T for every item in a batch? Don't we only need the specific x_t and x_{t-1}, rather than the loop from 0 to T?

Looking forward to your reply.

naoto0804 commented 1 year ago

there is no code needs

Unfortunately, I cannot understand this part. Could you elaborate a bit more?

why randomly sampled a 't' between 0 to T for every item in a batch?

I just followed the previous work VQDiffusion, so actually I am not sure. (I guess evaluating different t at the same time might stabilize the importance sampling in the earlier phase of training?) There might exist a better choice.
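
As general background on the questions in this thread: diffusion models usually avoid a loop during training because x_t can be drawn from x_0 in closed form for any t, while the loop from T down to 0 only appears at generation time. The sketch below illustrates this with a toy uniform-noise process over single tokens. It is not LayoutDM's transition scheme or code, and K, T, ALPHAS, and denoise_step are hypothetical names chosen for this example (timesteps here run from 1 to T, which may differ from the repository's indexing).

```python
import random

K = 5               # number of categories (hypothetical)
T = 10              # number of diffusion steps (hypothetical)
ALPHAS = [0.9] * T  # per-step probability of keeping the token (hypothetical schedule)


def cumulative_keep_prob(t):
    """Probability that the token has never been resampled during steps 1..t."""
    prob = 1.0
    for alpha in ALPHAS[:t]:
        prob *= alpha
    return prob


def q_sample(x0, t, rng):
    """Draw x_t ~ q(x_t | x_0) in one shot, without visiting intermediate steps."""
    if rng.random() < cumulative_keep_prob(t):
        return x0
    return rng.randrange(K)  # resampled uniformly; may land on x0 again


def make_training_batch(x0_batch, rng):
    """Training: each item gets its own random t and its own noisy x_t."""
    t = [rng.randrange(1, T + 1) for _ in x0_batch]
    xt = [q_sample(x0, ti, rng) for x0, ti in zip(x0_batch, t)]
    return t, xt


def generate(denoise_step, rng):
    """Generation: the only place that loops over t = T, T-1, ..., 1."""
    x = rng.randrange(K)  # x_T is pure noise
    for t in range(T, 0, -1):
        x = denoise_step(x, t)  # placeholder for sampling x_{t-1} from the model
    return x


if __name__ == "__main__":
    rng = random.Random(0)
    print(make_training_batch([0, 1, 2, 3], rng))
    print(generate(lambda x, t: x, rng))
```

Under these assumptions, a batch with different t values trains the denoiser on many noise levels in a single step, and no training step ever needs the full chain between x_0 and x_T.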