apexrl / Diff4RLSurvey

This repository contains a collection of resources and papers on diffusion models for RL, accompanying the paper "Diffusion Models for Reinforcement Learning: A Survey".
Apache License 2.0

Regarding Crossway Diffusion #1

Closed LostXine closed 11 months ago

LostXine commented 11 months ago

Hello,

Thanks for making this comprehensive survey; it is our great pleasure to see our recent work, Crossway Diffusion, introduced as well. However, I've noticed a slight oversight in how our paper is covered in both your paper and this repo. It would be great if it could be amended.

  1. As you mentioned in this repo, Crossway Diffusion is an extension of Diffusion Policy for imitation learning, which uses diffusion models as policies. So in Table 1 of your paper, our method should be moved to the second row, and the title of the table might need to be revised as well.
  2. Crossway Diffusion relies on an auxiliary reconstruction loss to guide the intermediate representation from the diffusion model. I believe it would make your paper more informative if you could cover this information in the table.
  3. Crossway Diffusion was rejected by CoRL 2023 and is currently under review at another conference. Though I'm happy to see it listed under CoRL, that is unfortunately not the case :(

Thank you,

zbzhu99 commented 11 months ago

Hello @LostXine,

We sincerely appreciate your attention to our survey paper, and thank you for providing the valuable feedback.

Regarding the first point: in Table 1, we label only the imitation learning methods that learn from human demonstrations as imitation, and we use the term offline to refer to both offline imitation learning and offline RL methods that learn from other sources. We apologize for the confusion caused by this oversight. We have revised the notes on applications in Table 1 to align strictly with those in Section 5.

Regarding the "roles" column, Crossway Diffusion should be categorized as a planner method and is thus in the right place; we will relocate the entry for Diffusion Policy [1] to the planner row as well. Although Diffusion Policy does not plan in the state space, we still prefer to view it as an action planner. Like state planners, this kind of action planner leverages the diffusion model to plan multiple steps simultaneously, enjoying better temporal consistency compared to autoregressive methods. We will elaborate on our categorization principles for the different roles in the revised manuscript.

For the second point, we do not want to include too many details of each paper in Table 1 due to space limits, but we are willing to highlight the specific contribution of Crossway Diffusion in the main text.

We plan to upload a revised version of the survey paper by the end of this month, integrating the changes for the first and second points.

For the third point, we have already fixed this information in the GitHub repo. We wish your paper good luck in its review at the upcoming conference, and best wishes for a successful acceptance!

Best regards, Zhengbang

[1] Chi C., Feng S., Du Y., et al. Diffusion Policy: Visuomotor Policy Learning via Action Diffusion. arXiv preprint arXiv:2303.04137, 2023.

LostXine commented 11 months ago

Hi @zbzhu99 ,

Thank you so much for your prompt reply. Your solution looks perfect to me and I appreciate all your great efforts and kind words.

Best,

zbzhu99 commented 10 months ago

The updated manuscript is now available at https://arxiv.org/abs/2311.01223v2.