microsoft / med-deadend

Code for the Medical Deadend Paper at NeurIPS 2021
MIT License

Dead-end Discovery connection with lifegate #7

Open dmasamba opened 3 weeks ago

dmasamba commented 3 weeks ago

Hello, I am trying to understand the connection between DeD and LifeGate. How are the models trained using DeD tested on the LifeGate environment? I looked through the provided scripts to see how the LifeGate environment uses the $V_D$ and $V_R$ obtained from training the offline RL method, but I can't clearly see how it works. Any explanation would be greatly appreciated. Thank you so much.

twkillian commented 2 weeks ago

Hi @dmasamba, thanks for reaching out.

Before I dive in and help answer your questions, I must ask: have you read the designated sections in the paper and appendix?

An additional resource can be found in my follow-up 2023 paper where we used LifeGate in a slightly different manner.

We have attempted to be pretty clear about how we used the LifeGate toy domain with DeD as an instructive demonstration of what $V_D$ and $V_R$ can learn from offline demonstration data. Ultimately, the data used to learn $V_D$ and $V_R$ can be gathered however you want. We largely generated the data using random walk trajectories, hoping to cover as much of the state space as possible. It is important to note, however, that we did not use online RL at all in the LifeGate experiments; everything was done in an offline setting.
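As a rough illustration of the offline setup described above (not the repository's code), here is a minimal sketch. It assumes a hypothetical 1-D chain as a stand-in for LifeGate, with hypothetical helpers `step`, `collect_random_walks`, and `offline_q`. It also assumes that $V_D$ is learned with a reward of -1 only on reaching death and $V_R$ with a reward of +1 only on reaching recovery, both undiscounted. Random walks fill a fixed batch, and both value tables are then fit from that batch alone, with no further environment interaction.

```python
import random

# Hypothetical 1-D stand-in for LifeGate (not the repository's environment):
# state 0 = death (terminal), state 5 = recovery (terminal),
# state 1 = dead end (every action leads to death).
N_STATES, DEATH, RECOVERY, DEAD_END = 6, 0, 5, 1
ACTIONS = (0, 1)  # 0 = left, 1 = right


def step(state, action):
    """Deterministic transition; returns (next_state, done)."""
    if state == DEAD_END:
        next_state = DEATH
    else:
        next_state = state - 1 if action == 0 else state + 1
    return next_state, next_state in (DEATH, RECOVERY)


def collect_random_walks(n_episodes, seed=0):
    """Gather offline transitions (s, a, s', done) with a uniform random policy."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_episodes):
        state, done = rng.randint(1, N_STATES - 2), False
        while not done:
            action = rng.choice(ACTIONS)
            next_state, done = step(state, action)
            data.append((state, action, next_state, done))
            state = next_state
    return data


def offline_q(data, terminal_state, terminal_reward, sweeps=50):
    """Undiscounted tabular Q-iteration over a fixed batch (no environment access)."""
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(sweeps):
        for s, a, s2, done in data:
            if done:
                # Reward is nonzero only when the designated terminal is reached.
                q[s][a] = terminal_reward if s2 == terminal_state else 0.0
            else:
                q[s][a] = max(q[s2])
    return q


data = collect_random_walks(200)
q_d = offline_q(data, DEATH, -1.0)     # assumed reward: -1 only on reaching death
q_r = offline_q(data, RECOVERY, +1.0)  # assumed reward: +1 only on reaching recovery
v_d = [max(row) for row in q_d]
v_r = [max(row) for row in q_r]
```

In this toy chain, the dead-end state should end up with `v_d` at -1 and `v_r` at 0, while its neighbor (state 2) keeps `v_d` at 0 because one action still leads toward recovery. The action from state 2 that steps into the dead end gets a `q_d` of -1, which is the kind of signal DeD uses to flag actions to avoid.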

dmasamba commented 1 week ago

Hello @twkillian, thank you so much for getting back to me. I'm sorry I didn't read the sections concerning LifeGate in the paper before asking that question, but I went through them afterwards and they helped me better understand what was going on. Thank you so much for your feedback and explanation.

twkillian commented 1 week ago

Great, I'm happy that I could help!