irom-lab / dppo

Official implementation of Diffusion Policy Policy Optimization, arXiv 2024
https://diffusion-ppo.github.io
MIT License

Possible bug introduced in commit (2ddf63b) #5

Closed · acl21 closed this 1 week ago

acl21 commented 1 week ago

Hi @allenzren, firstly, thank you for such a nicely written code repository.

I suspect you introduced a bug in your recent commit 2ddf63b. You can see here that the forward function of model.diffusion.diffusion.DiffusionMLP is no longer implemented.

Is this expected behaviour?

Thanks in advance.

allenzren commented 1 week ago

Hi @acl21, thanks for the message. I did a major refactor of the code last week to add some new functionality and clean things up a bit. I deprecated the forward function in the Diffusion class since the code never uses the base class for running policy inference. If you are running DPPO, the forward function should be the one here. Does this make sense?
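
For intuition, sampling from a diffusion policy is just the DDPM reverse process; a minimal sketch is below (the class name, denoise_net, and cond are illustrative assumptions, not the repo's actual API):

```python
import torch


class DiffusionPolicySketch(torch.nn.Module):
    """Illustrative DDPM-style sampling forward for a diffusion policy.

    Names and shapes here are assumptions for illustration,
    not the actual dppo API.
    """

    def __init__(self, denoise_net, betas):
        super().__init__()
        self.denoise_net = denoise_net        # predicts noise eps(x_t, t, cond)
        self.register_buffer("betas", betas)  # (T,) noise schedule
        self.register_buffer("alphas", 1.0 - betas)
        self.register_buffer("alpha_bars", torch.cumprod(1.0 - betas, dim=0))

    @torch.no_grad()
    def forward(self, cond, horizon, action_dim):
        """Reverse diffusion: start from Gaussian noise, denoise for T steps."""
        B = cond.shape[0]
        x = torch.randn(B, horizon, action_dim, device=cond.device)
        for t in reversed(range(len(self.betas))):
            t_batch = torch.full((B,), t, device=cond.device, dtype=torch.long)
            eps = self.denoise_net(x, t_batch, cond)
            alpha, alpha_bar = self.alphas[t], self.alpha_bars[t]
            # DDPM posterior mean of x_{t-1} given x_t
            x = (x - (1.0 - alpha) / torch.sqrt(1.0 - alpha_bar) * eps) / torch.sqrt(alpha)
            if t > 0:
                x = x + torch.sqrt(self.betas[t]) * torch.randn_like(x)
        return x  # sampled action sequence
```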

I guess one might want to evaluate the pre-trained policy only, but that can also be done by running the fine-tuning script for a single iteration (the first iteration runs eval).

Btw, it is model.diffusion.diffusion.DiffusionModel, not model.diffusion.diffusion.DiffusionMLP.

acl21 commented 1 week ago

Ah, that makes sense. Yes, indeed, I wanted to evaluate a pre-trained policy (on a different dataset - CALVIN). Thank you for the quick clarification.

allenzren commented 1 week ago

Cool, yea I think running eval with a separate script would be helpful. Let me add the implementation so it is easier to run eval; it would also avoid the GPU overhead from making a copy of the policy, for example.
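
Roughly what I have in mind is a loop like this minimal sketch (the gym-style env API and the policy(obs) call are illustrative assumptions, not the final interface):

```python
import torch


@torch.no_grad()
def evaluate(policy, env, n_episodes=10, device="cuda"):
    """Run the policy in-place for a few episodes, with no copy of the
    weights (so no extra GPU memory). Gym-style env API and the
    policy(obs) call are illustrative assumptions."""
    policy.eval()
    returns = []
    for _ in range(n_episodes):
        obs, _ = env.reset()
        done, ep_return = False, 0.0
        while not done:
            obs_t = torch.as_tensor(obs, dtype=torch.float32, device=device).unsqueeze(0)
            action = policy(obs_t).squeeze(0).cpu().numpy()
            obs, reward, terminated, truncated, _ = env.step(action)
            done = terminated or truncated
            ep_return += float(reward)
        returns.append(ep_return)
    return sum(returns) / len(returns)
```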

allenzren commented 1 week ago

I just added the evaluation agents and re-implemented forward in the base diffusion class for running eval. See c9f24ba0c3f9092c000830a10004453bc40886d3.