takuseno / d3rlpy

An offline deep reinforcement learning library

https://takuseno.github.io/d3rlpy

MIT License

[REQUEST] Adding model-based offline RL with image inputs like LOMPO and COMBO #58

Open kargarisaac opened 3 years ago

kargarisaac commented 3 years ago

Is your feature request related to a problem? Please describe.
Model-based offline RL algorithms that can handle image inputs are necessary for some environments.

Describe the solution you'd like
Adding implementations of algorithms like LOMPO and COMBO would be great. The papers mention that both build on the MOPO implementation, which is written in TensorFlow.

takuseno commented 3 years ago

@kargarisaac Thanks for your request! I've also been looking at the COMBO paper, and it looks good. Fortunately, d3rlpy already supports MOPO, so it shouldn't take long to implement. I'll update this issue once d3rlpy supports COMBO.

kargarisaac commented 3 years ago

Thank you. I'd like to work on adding LOMPO too. Is there a template or document for contributors?

takuseno commented 3 years ago

@kargarisaac Sounds nice! All we have for contributors right now is this document: https://github.com/takuseno/d3rlpy/blob/master/CONTRIBUTING.md

Any kind of contribution is appreciated, and feel free to ask how we implement new algorithms.

kargarisaac commented 3 years ago

Here is a COMBO implementation, but it doesn't support image inputs: https://agit.ai/Polixir/OfflineRL/src/branch/master

One question about adding image support to MOPO: are you working on that? I see a TODO in the code for it. Do you know of any model-based offline RL code that can handle image inputs?

takuseno commented 3 years ago

@kargarisaac Currently, d3rlpy's MOPO does not support image inputs because there was no benchmark for that. But we can make it support image inputs, since all algorithms are designed to be independent of the observation shape. One tricky part is that we need to automatically determine the deconvolution layers at the end of the dynamics model.
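
To make the "automatically determine deconvolution layers" point concrete, here is a minimal sketch of one possible approach, not d3rlpy's actual code. It assumes the dynamics model has a convolutional encoder described by (kernel_size, stride, padding) triples with dilation 1, and it mirrors those layers in reverse, choosing each `output_padding` so the decoder recovers the original spatial size exactly. The names `conv_out`, `deconv_out`, and `mirror_decoder` are hypothetical helpers introduced for illustration.

```python
from typing import List, Tuple

ConvSpec = Tuple[int, int, int]         # (kernel_size, stride, padding)
DeconvSpec = Tuple[int, int, int, int]  # (kernel_size, stride, padding, output_padding)


def conv_out(size: int, kernel: int, stride: int, padding: int) -> int:
    """Spatial output size of a convolution (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def deconv_out(size: int, kernel: int, stride: int, padding: int,
               output_padding: int = 0) -> int:
    """Spatial output size of a transposed convolution (dilation 1)."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding


def mirror_decoder(input_size: int,
                   encoder: List[ConvSpec]) -> Tuple[int, List[DeconvSpec]]:
    """Return the encoder's final feature size and mirrored deconv specs.

    Hypothetical helper: each deconv layer reuses the kernel, stride, and
    padding of the matching encoder layer, and output_padding absorbs the
    remainder lost by the floor division in the forward convolution.
    """
    sizes = [input_size]
    for kernel, stride, padding in encoder:
        sizes.append(conv_out(sizes[-1], kernel, stride, padding))

    decoder: List[DeconvSpec] = []
    # Walk the encoder backwards: `current` is a layer's output size,
    # `target` is the input size the deconv layer has to restore.
    for (kernel, stride, padding), target, current in zip(
        reversed(encoder), reversed(sizes[:-1]), reversed(sizes[1:])
    ):
        output_padding = target - deconv_out(current, kernel, stride, padding)
        decoder.append((kernel, stride, padding, output_padding))
    return sizes[-1], decoder


if __name__ == "__main__":
    # An 84x84 input with three conv layers (sizes chosen for illustration).
    feature_size, specs = mirror_decoder(84, [(8, 4, 0), (4, 2, 0), (3, 1, 0)])
    print(feature_size, specs)
```

The computed `output_padding` is always smaller than the stride, which is the constraint transposed-convolution layers in common frameworks impose. Channel counts are left out because they do not affect the spatial arithmetic.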

takuseno commented 3 years ago

@kargarisaac Sorry for the late reply, but I've prototyped COMBO for vector observations: https://github.com/takuseno/d3rlpy/blob/master/d3rlpy/algos/combo.py I have not tested it yet, but the implementation should not be far from the paper.
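
For readers unfamiliar with COMBO, the following is a rough sketch of the critic objective as I understand it from the paper, not a copy of the linked combo.py. It assumes the loss is a conservative term (mean Q on model-rollout samples minus mean Q on dataset samples, scaled by a coefficient `beta`) plus a squared Bellman error on a batch that mixes dataset and rollout transitions. The function name and all argument names are hypothetical, and the inputs are plain lists of already-computed Q-values rather than network outputs.

```python
from statistics import fmean
from typing import Sequence


def combo_critic_loss(
    q_model: Sequence[float],         # Q(s, a) on model-rollout samples
    q_data: Sequence[float],          # Q(s, a) on offline-dataset samples
    q_mixed: Sequence[float],         # Q(s, a) on the mixed batch
    td_target_mixed: Sequence[float], # Bellman targets for the mixed batch
    beta: float,
) -> float:
    """Illustrative COMBO-style critic loss on precomputed values."""
    # Push Q down on rollout samples and up on dataset samples.
    conservative = fmean(q_model) - fmean(q_data)
    # Standard squared Bellman error on the mixed batch.
    bellman = 0.5 * fmean([(q - y) ** 2 for q, y in zip(q_mixed, td_target_mixed)])
    return beta * conservative + bellman


if __name__ == "__main__":
    print(combo_critic_loss([2.0, 4.0], [1.0, 1.0], [1.0, 3.0], [1.0, 1.0], beta=0.5))
```

Practical implementations may replace the first mean with a log-sum-exp over sampled actions, as CQL does, so treat this only as a reading aid when comparing against the paper.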

kargarisaac commented 3 years ago

@takuseno Thank you. Sounds great :)

dmksjfl commented 3 years ago

I tested COMBO and got very bad results on the medium tasks from D4RL (across 3 random seeds); e.g., the evaluated return is even negative in some tasks. Could you help me with that?
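
When reporting results like these, it can help to state them as D4RL normalized scores with a mean and standard deviation across seeds. Below is a minimal sketch, assuming the usual normalization 100 * (return - random) / (expert - random). The function names are hypothetical, and the reference returns in the example are placeholders, not real D4RL values.

```python
from statistics import fmean, stdev
from typing import Sequence, Tuple


def normalized_score(ret: float, random_ret: float, expert_ret: float) -> float:
    """Scale a raw return so that random = 0 and expert = 100."""
    return 100.0 * (ret - random_ret) / (expert_ret - random_ret)


def summarize(returns_per_seed: Sequence[float], random_ret: float,
              expert_ret: float) -> Tuple[float, float]:
    """Mean and sample standard deviation of normalized scores across seeds."""
    scores = [normalized_score(r, random_ret, expert_ret) for r in returns_per_seed]
    return fmean(scores), stdev(scores)


if __name__ == "__main__":
    # Placeholder numbers: three seeds, random return 0, expert return 100.
    mean, std = summarize([-10.0, 0.0, 10.0], random_ret=0.0, expert_ret=100.0)
    print(f"{mean:.1f} +/- {std:.1f}")
```

A negative normalized score means the policy did worse than the random reference, which is a useful signal to include when asking for help.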