Open kargarisaac opened 3 years ago
@kargarisaac Thanks for your request! I've also been looking at the COMBO paper, and it looks good. Fortunately, d3rlpy already supports MOPO, so it won't take long to implement. I'll update this issue once d3rlpy supports COMBO.
Thank you. I want to work on adding LOMPO too. Is there a template or document for contributors?
@kargarisaac Sounds nice! Actually, all we have for contributors right now is this document: https://github.com/takuseno/d3rlpy/blob/master/CONTRIBUTING.md
Contributions of any kind are appreciated, and feel free to ask how we implement new algorithms.
Here is a COMBO implementation, but it doesn't support image inputs: https://agit.ai/Polixir/OfflineRL/src/branch/master
One question about adding image support for MOPO: are you working on that? I see a TODO in the code for it. Do you know of any model-based offline RL code that can handle image inputs?
@kargarisaac Currently, d3rlpy's MOPO does not support image inputs because there was no benchmark for it. But we can make it support image inputs, since all algorithms are basically designed to be independent of the observation shape. One tricky part is that we need to automatically determine the deconvolution layers at the end of the dynamics model.
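To make the "automatically determine deconvolution layers" point concrete, here is a minimal sketch of one possible approach: mirror the encoder's convolution layers in reverse and compute the output padding each transposed convolution needs to recover the original size. This is not d3rlpy code. The names `DeconvSpec` and `mirror_deconv_specs` are hypothetical, and the sketch assumes square inputs, zero padding, and no dilation.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class DeconvSpec:
    """Parameters for one transposed-convolution layer (hypothetical helper type)."""
    kernel_size: int
    stride: int
    output_padding: int
    out_size: int  # spatial size produced by this layer


def conv_out_size(size: int, kernel_size: int, stride: int) -> int:
    # Standard convolution output size with zero padding and no dilation.
    return (size - kernel_size) // stride + 1


def mirror_deconv_specs(
    input_size: int, conv_layers: Sequence[Tuple[int, int]]
) -> List[DeconvSpec]:
    """Derive deconvolution layers that undo a stack of (kernel_size, stride) convs."""
    # Forward pass over sizes only: record the spatial size after each conv.
    sizes = [input_size]
    for kernel_size, stride in conv_layers:
        if sizes[-1] < kernel_size:
            raise ValueError("feature map is smaller than the kernel")
        sizes.append(conv_out_size(sizes[-1], kernel_size, stride))

    # Walk the encoder backwards, choosing output_padding so that each
    # transposed conv restores the size the matching conv started from.
    specs = []
    for i in reversed(range(len(conv_layers))):
        kernel_size, stride = conv_layers[i]
        small, target = sizes[i + 1], sizes[i]
        base = (small - 1) * stride + kernel_size  # size with output_padding = 0
        specs.append(DeconvSpec(kernel_size, stride, target - base, target))
    return specs


if __name__ == "__main__":
    # Example encoder: an 84x84 input with three (kernel_size, stride) convs.
    for spec in mirror_deconv_specs(84, [(8, 4), (4, 2), (3, 1)]):
        print(spec)
```

The `output_padding` value covers the pixels lost to the floor division in the forward convolution, so it is always smaller than the stride. Under these assumptions, each spec maps directly onto the `kernel_size`, `stride`, and `output_padding` arguments of a transposed-convolution layer such as PyTorch's `ConvTranspose2d`.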
@kargarisaac This is very late, but I've prototyped COMBO for vector observations: https://github.com/takuseno/d3rlpy/blob/master/d3rlpy/algos/combo.py I have not tested it yet, but the implementation should be close to the paper.
@takuseno Thank you. Sounds great :)
I tested COMBO and got very poor results on the medium tasks from D4RL (across 3 random seeds); for example, the evaluated return is even negative on some tasks. Could you help me with that?
Is your feature request related to a problem? Please describe.
Some environments require model-based offline RL algorithms that can handle image inputs.
Describe the solution you'd like
Adding implementations of algorithms like LOMPO and COMBO would be great. The papers mention that they are based on the MOPO implementation, which is written in TensorFlow.