anuragajay / decision-diffuser

Where is the code for history conditioning proposed in the paper? #3

Open tinnerhrhe opened 1 year ago

tinnerhrhe commented 1 year ago

According to the paper, Decision Diffuser maintains a history of length K and conditions on it so that the plan stays consistent. However, I cannot find the code that implements this. Could you point me to it? Thank you very much!

Looomo commented 1 year ago

Hi, I was just reading the code. Have you found the length-K history? I can't find it either.

tinnerhrhe commented 1 year ago

I haven't found it yet. Still waiting for the authors' response.

XueruiSu commented 1 year ago

I am still confused about how conditional generation works in the diffusion model.

I found that the config file in this repo uses the "TemporalUnet" model as the diffusion model, so I checked how this model performs conditional generation of states in RL. However, the "cond" parameter of its forward function (which I take to mean the conditioning parameters) is never used anywhere in the "TemporalUnet" code. So either the authors did not supply the right config file and "TemporalUnet" is not the conditional generation model, or there is something else I have not found. Either way, I am also waiting for the authors' response.

Best wishes!

Looomo commented 1 year ago

cond actually means s_t (the current observation), which is already written into x[:, 0, action_dim:]. So yes, the cond parameter is unused in the U-Net. See https://github.com/anuragajay/decision-diffuser/blob/01ce528c30b4733dc59aa6203e46ec165561158d/code/diffuser/models/diffusion.py#L260C69-L260C69 for details.
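To make the slicing above concrete, here is a minimal sketch of the trajectory layout it implies: each timestep holds the action followed by the observation, so `x[:, 0, action_dim:]` is the observation part of the first timestep. NumPy is used here as a stand-in for the repo's PyTorch tensors, and the sizes and the `s_t` array are assumptions chosen only for illustration.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
batch, horizon, action_dim, obs_dim = 2, 4, 3, 5

# One trajectory array: every timestep is laid out as [action | observation].
x = np.zeros((batch, horizon, action_dim + obs_dim))

# Stand-in for the current observation s_t of each batch element.
s_t = np.ones((batch, obs_dim))

# Overwrite only the observation slice of the first timestep.
x[:, 0, action_dim:] = s_t

first_step_actions = x[:, 0, :action_dim]   # untouched by the assignment
first_step_obs = x[:, 0, action_dim:]       # now equal to s_t
later_steps = x[:, 1:, :]                   # untouched by the assignment
```

Because the observation is written directly into the input, a network can be conditioned on it without ever reading a separate `cond` argument.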

michaeljteng commented 1 year ago

I think the history conditioning happens here: https://github.com/anuragajay/decision-diffuser/blob/01ce528c30b4733dc59aa6203e46ec165561158d/code/diffuser/models/diffusion.py#L260

^ This calls the helper:

def apply_conditioning(x, conditions, action_dim):
    for t, val in conditions.items():
        x[:, t, action_dim:] = val.clone()
    return x

The "conditions" passed to it come from the dataset parametrization you choose. This can be the sequence dataset (condition on the first observation): https://github.com/anuragajay/decision-diffuser/blob/01ce528c30b4733dc59aa6203e46ec165561158d/code/diffuser/datasets/sequence.py#L75

or one of the other classes, which pass in different conditions, such as a longer history or goaldataset.
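As a rough illustration of how a length-K history could be expressed through this helper, the sketch below builds a `conditions` dictionary that pins the first K timesteps to past observations and re-applies it after every step of a toy loop. This is a sketch under stated assumptions, not the repo's implementation: `apply_conditioning_np` is a NumPy stand-in for the helper quoted above, `history_conditions` is a hypothetical function, the noise update is only a placeholder for a real denoising step, and all sizes are made up.

```python
import numpy as np

def apply_conditioning_np(x, conditions, action_dim):
    """NumPy stand-in for the helper above: pin observations at given timesteps."""
    for t, val in conditions.items():
        x[:, t, action_dim:] = val.copy()
    return x

def history_conditions(past_obs):
    """Hypothetical helper: map the last K observations to timesteps 0..K-1."""
    # past_obs has shape (batch, K, obs_dim), oldest observation first.
    return {t: past_obs[:, t] for t in range(past_obs.shape[1])}

rng = np.random.default_rng(0)
batch, horizon, action_dim, obs_dim, K = 2, 8, 3, 4, 3  # illustrative sizes

past_obs = rng.normal(size=(batch, K, obs_dim))
conditions = history_conditions(past_obs)

# Start from noise, pin the history, then re-pin it after every update.
x = rng.normal(size=(batch, horizon, action_dim + obs_dim))
x = apply_conditioning_np(x, conditions, action_dim)
for _ in range(5):
    x = x + 0.1 * rng.normal(size=x.shape)  # placeholder for a denoising step
    x = apply_conditioning_np(x, conditions, action_dim)

# A goal-style condition pins the first and last timesteps instead.
goal = rng.normal(size=(batch, obs_dim))
goal_conditions = {0: past_obs[:, -1], horizon - 1: goal}
x_goal = apply_conditioning_np(x.copy(), goal_conditions, action_dim)
```

The only thing that changes between the first-observation, history, and goal variants is which timesteps appear as keys in the dictionary; the helper itself stays the same. Whether the released datasets actually emit a K-step dictionary like this is exactly what this thread leaves open.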

LiuTaowen-Tony commented 9 months ago

I believe this is inherited from the earlier work "Planning with Diffusion for Flexible Behavior Synthesis", where it is used to set the initial and final states for planning.

Personally, I don't think this is the mechanism the authors propose for adding constraints.

1180300215 commented 1 month ago

So, has anyone solved this problem?