nicklashansen / tdmpc

Code for "Temporal Difference Learning for Model Predictive Control"
MIT License

A Question about mpc and td-mpc #3

Closed wenzhoulyu closed 2 years ago

wenzhoulyu commented 2 years ago

Thank you for your reply a few days ago. I'm working on controlling medicine dosage with RL. I wonder whether TD-MPC can compete with MPC that uses a biological model. If so, what is the advantage of TD-MPC over MPC in this field? This is just my own question.

wenzhoulyu commented 2 years ago

In my opinion, TD-MPC combines MPC planning with RL's infinite-horizon return estimate, which is its advantage over the original MPC.

nicklashansen commented 2 years ago

Correct, one advantage is the return estimates. Another differentiator is the way the model is learned and used. Our model is easy to learn and performs well in practice, but because the model in TD-MPC is a latent model with no reconstruction objective, it might be harder to interpret and enforce e.g. safety constraints in TD-MPC compared to a state-based model as commonly used in MPC, so that is something to keep in mind.
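To make the point about return estimates concrete, here is a minimal sketch of how a candidate action sequence could be scored: discounted rewards over a short planning horizon, plus a learned value estimate at the final latent state. It is an illustration only, not the code in this repository. The names `estimate_return`, `toy_dynamics`, `toy_reward`, `toy_q_value` and `toy_policy` are hypothetical, and the toy functions stand in for learned networks.

```python
from typing import Callable, Sequence


def estimate_return(
    z0: float,
    actions: Sequence[float],
    dynamics: Callable[[float, float], float],
    reward: Callable[[float, float], float],
    q_value: Callable[[float, float], float],
    policy: Callable[[float], float],
    gamma: float = 0.99,
) -> float:
    """Score an action sequence: finite-horizon rewards plus a terminal value."""
    z = z0
    total = 0.0
    discount = 1.0
    for a in actions:
        total += discount * reward(z, a)  # reward predicted in latent space
        z = dynamics(z, a)                # roll the latent model one step forward
        discount *= gamma
    # Bootstrap beyond the horizon with the value estimate at the last state.
    # Plain finite-horizon MPC would stop before this line.
    total += discount * q_value(z, policy(z))
    return total


# Toy stand-ins (assumptions for illustration, not learned models).
def toy_dynamics(z: float, a: float) -> float:
    return z + a


def toy_reward(z: float, a: float) -> float:
    return -(z ** 2)


def toy_q_value(z: float, a: float) -> float:
    return -10.0 * z ** 2


def toy_policy(z: float) -> float:
    return 0.0


# Moving toward z = 0 scores better than staying put, mostly because of
# the terminal value term.
move = estimate_return(1.0, [-0.5, -0.5], toy_dynamics, toy_reward,
                       toy_q_value, toy_policy, gamma=0.5)
stay = estimate_return(1.0, [0.0, 0.0], toy_dynamics, toy_reward,
                       toy_q_value, toy_policy, gamma=0.5)
print(move, stay)
```

A planner would evaluate many sampled action sequences this way and execute the first action of the best one. With a state-based model, constraints can be checked directly on `z` inside the loop; with a latent model without reconstruction that check is less straightforward, which is the caveat mentioned above.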

wenzhoulyu commented 2 years ago

By the way, I have an idea: combine a biological model that is hard for a neural network to learn with TD-MPC's method, to compete with the original MPC that uses such a model. Or combine the biological model with a neural network to learn better task-oriented latent dynamics. I hope the idea works out. Thank you for your excellent work and patient reply.
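One possible reading of the second idea, as a rough sketch: keep the known biological model as the base prediction and let a learned component correct only what that model gets wrong. This residual structure is an assumption about how the two could be combined, not something TD-MPC does. The one-compartment dosing model, its parameters `k` and `dt`, and the names `mechanistic_step`, `hybrid_step` and `toy_residual` are all hypothetical.

```python
from typing import Callable


def mechanistic_step(c: float, dose: float, k: float = 0.1, dt: float = 1.0) -> float:
    """Hypothetical biological model: first-order elimination plus dosing (Euler step)."""
    return c + dt * (-k * c + dose)


def hybrid_step(
    c: float,
    dose: float,
    residual: Callable[[float, float], float],
) -> float:
    """Known model prediction plus a learned correction term."""
    return mechanistic_step(c, dose) + residual(c, dose)


# Stand-in for a trained network. In practice the residual would be fit to
# the gap between observed next states and mechanistic_step predictions.
def toy_residual(c: float, dose: float) -> float:
    return -0.05 * c - 0.1 * dose


base = mechanistic_step(2.0, 1.0)
corrected = hybrid_step(2.0, 1.0, toy_residual)
print(base, corrected)
```

Because the correction is additive, the biological model stays inspectable and the learned part only needs to capture the mismatch.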

nicklashansen commented 2 years ago

Happy to help, good luck with your project!