LucasAlegre / morl-baselines

Multi-Objective Reinforcement Learning algorithms implementations.
https://lucasalegre.github.io/morl-baselines
MIT License

Generalization of the observations for GPI-LS #81

Closed AdrienBolling closed 7 months ago

AdrienBolling commented 9 months ago

Here is my modification of the library to accept all types of observations. More is left up to the person using the library, especially through the implementation of the generic Observation class (which could admittedly be moved to MO-Gym alongside the associated wrapper). This has been tested with the Minecart file to check that it is still compatible, and it seems to be, as it ran without issues.

The principle is that all observations are treated as being of type Observation and are handled as arrays of objects, to keep as much of the previous code as possible. The conversion to tensor is left to the discretion of the user and is done in the forward function of the q_net.

This PR also allows people to use a custom q_net with their observations (observations that are not a wrapper around a NumPy vector will require a custom q_net by default).
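To make the idea concrete, here is a minimal sketch of the pattern described above, not the actual code from this PR. The `to_array` method, the `as_object_batch` helper, and the `LinearQNet` class are hypothetical names chosen for illustration, and the network is a plain NumPy stand-in rather than a real torch module. It only shows observations being carried around as an array of objects and converted to numbers inside `forward`.

```python
import numpy as np


class Observation:
    """Hypothetical base class: wraps whatever the environment returns."""

    def __init__(self, data):
        self.data = data

    def to_array(self):
        # Default conversion, only valid when `data` is already vector-like.
        # A graph observation would override this or use a custom q_net.
        return np.asarray(self.data, dtype=np.float32).ravel()


def as_object_batch(observations):
    """Store observations in a 1-D object array, one Observation per slot."""
    batch = np.empty(len(observations), dtype=object)
    for i, obs in enumerate(observations):
        batch[i] = obs
    return batch


class LinearQNet:
    """NumPy stand-in for a q_net (assumption: a real one would be a torch module)."""

    def __init__(self, obs_dim, n_actions, seed=0):
        rng = np.random.default_rng(seed)
        self.weights = rng.normal(size=(obs_dim, n_actions)).astype(np.float32)

    def forward(self, batch):
        # The network, not the buffer, decides how observations become numbers.
        x = np.stack([obs.to_array() for obs in batch])
        return x @ self.weights


batch = as_object_batch([Observation([0.0, 1.0, 2.0]), Observation([3.0, 4.0, 5.0])])
q_values = LinearQNet(obs_dim=3, n_actions=2).forward(batch)
print(q_values.shape)
```

With this layout, the rest of the code only ever sees an opaque array of objects, which is why a non-vector observation needs its own conversion step or its own q_net.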

The basic Observation class I made should be pretty compatible with a good number of observations, although not optimally.

Please disregard the changes made to envelope; these were old tweaks and are much less transferable.

I think this implementation could be deployed across the entire library without too much hassle.

Please excuse the mess of a PR this is; it is the first contribution of this sort I have ever made.

I have also not really touched the ProbabilisticEnsemble part. I think a reasonable option would be to let the user provide a custom model for the env simulation if they wish to use a "complex" observation.

(This modification mainly stems from my need to use your library with DGL Heterographs observations)

AdrienBolling commented 9 months ago

I just noticed that I requested a merge into your main branch. I'm not experienced with git beyond personal use, so I don't know if you'll be able to redirect it to another branch or if I need to redo it against another branch myself?

LucasAlegre commented 8 months ago

Hi @AdrienBolling, I am not sure I understand the motivation for this PR. GPI-LS already supports vector and image observations. In case another type of observation would be used, this could be done by a wrapper converting to vector, right?
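For illustration, the wrapper approach suggested here could look roughly like the sketch below. `FlattenToVector` and `ToyEnv` are hypothetical names, and the structured observation is made up for the example. With Gymnasium, one would typically subclass `gymnasium.ObservationWrapper` and override its `observation` method instead; the sketch avoids that dependency so that it stays self-contained.

```python
import numpy as np


class FlattenToVector:
    """Hypothetical wrapper: exposes a dict observation as one flat vector."""

    def __init__(self, env):
        self.env = env

    def observation(self, obs):
        # Concatenate every entry in a fixed (sorted) key order so that the
        # layout of the resulting vector is stable across steps.
        parts = [np.asarray(obs[key], dtype=np.float32).ravel() for key in sorted(obs)]
        return np.concatenate(parts)

    def reset(self):
        return self.observation(self.env.reset())


class ToyEnv:
    """Stand-in environment returning a structured observation."""

    def reset(self):
        return {"position": [1.0, 2.0], "cargo": [[0.5], [0.25]]}


flat = FlattenToVector(ToyEnv()).reset()
print(flat.shape)
```

This works when every observation flattens to the same fixed length; it does not by itself handle observations whose size or structure changes between steps.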

AdrienBolling commented 8 months ago

In my case, I am working with graph-based observations (more precisely, heterogeneous dynamic graphs).

So I have several issues with just using a flattened array:

ffelten commented 7 months ago

Hi Adrien 🙂,

I will close this since we do not want to overcomplicate the codebase for now. We believe there are more advantages in staying close to SB3/cleanRL (which only support Gymnasium's spaces) while MORL is still not so well known.

In case you need any more help on this, you can still message us on Discord :-).

Cheers,