entity-neural-network / incubator

Collection of in-progress libraries for entity neural networks.
Apache License 2.0

Add (Viz)Doom environments, take #2 #201

Closed Miffyli closed 2 years ago

Miffyli commented 2 years ago

#115, but with a fresh start.

Currently has a few environments. ViZDoom, being a "visual" learning env, is not the best fit for entity-gym, but the agent should still learn something.

TODO:

Single runs with different envs:

Miffyli commented 2 years ago

@cswinter

This is now ready for look-over/review. I tried to satisfy mypy but ran into headaches I did not understand: why the errors crept up and how to fix them (e.g. mismatched signatures; complaints about a variable being optional [which it indeed is, but not in practice], where trying to fix it led to mypy crashing).

Miffyli commented 2 years ago

1) Hmm, health gathering supreme (HGS) is a more difficult environment which appears in papers using ViZDoom (e.g. Fig. 6 in the sample-factory paper). I did not perform any hyperparam tuning, so you can probably squeeze more performance out of this. I also think it won't be a fair comparison by any means, as this code has access to the exact locations of objects, but at the same time does not know where walls are 😅.

2) I did struggle quite a bit making this env, tbh, but that is because I was not in the mind-set of "multiple entities being controlled"; my mind was in "single controlled player and a bunch of other entities". For example, all the action masking and "ids" (which are passed in Observation) are opaque to me and I do not know where they are needed (probably not needed for ViZDoom?). However, I just realized there was a tutorial.md available under enn-gym, so that would probably have helped me out :D. The examples were a good starting point, too, and thank you for making so many of them!

Also, I am not 100% sure if I am using the translation thing correctly. It seems very important for a use-case like this, and any docs/feedback on whether it is applied correctly would be nice :)

I will try the mypy fixes. I did try some assertions already, but these led to mypy crashing (though the code did work). Will try again.
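
For reference, the usual way to quiet the "variable is optional" complaints is to narrow the type with an explicit None check or assert right before use. A generic sketch, not this PR's actual code:

```python
from typing import Optional


class DoomEnvSketch:
    def __init__(self) -> None:
        # Set lazily on the first reset(), so mypy sees the attribute as Optional.
        self.last_game_state: Optional[dict] = None

    def observe(self) -> dict:
        state = self.last_game_state
        # Narrows Optional[dict] -> dict; mypy accepts attribute/key access after this.
        assert state is not None, "observe() called before reset()"
        return state
```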

cswinter commented 2 years ago

Thanks for the feedback! It could be worth adding a special case for environments with a single actor, since that's a pretty common use case and could remove a fair bit of complexity. You're actually using the translation correctly, but we could certainly use more (any!) docs on all the hyperparameters.

It looks like "defend the center" is doing pretty OK already; it's a bit hard to tell, but it looks like the paper baselines take around 20M steps to reach the performance your run reached in 0.5M. Is that a good baseline to compare to? I suppose seeing through walls would still give an unfair advantage on that task as well?

Is there any way to get the wall entities, or are those just not exposed by the environment? I suppose with the difference in visibility you can't make a good comparison anyway, but still nice to see that things are basically working.

vwxyzjn commented 2 years ago

Hey, @Miffyli I just fought with mypy recently so tried something :)

Miffyli commented 2 years ago

Hey, @Miffyli I just fought with mypy recently so tried something :)

Thanks @vwxyzjn! I was mentally prepared to struggle, but instead spent the minutes saved enjoying my coffee and the sun :blush:

It looks like "defend the center" is doing pretty OK already; it's a bit hard to tell, but it looks like the paper baselines take around 20M steps to reach the performance your run reached in 0.5M. Is that a good baseline to compare to? I suppose seeing through walls would still give an unfair advantage on that task as well?

Yup, I am going to say it is unfair, as the player is able to see behind him in this setup (entities) vs. the vision-based one :D.

Is there any way to get the wall entities, or are those just not exposed by the environment? I suppose with the difference in visibility you can't make a good comparison anyway, but still nice to see that things are basically working.

Walls are exposed as well, as some sort of list of points and lines. I was planning to add them, but figured the agent would have too hard a time learning from them: there are too many points/lines defining walls (well into the hundreds), and in static environments the agent should be able to learn which parts of the map it can bypass anyway (although the constantly updated normalization may make this difficult).

I feel like a combination of a tiny bit of vision + entities would work best. For example, the horizontal line from the middle of the depth map, so the agent knows where walls are. Should I try to add walls or this vision thing?
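
As a rough sketch of what I mean (assuming the depth buffer is enabled on the DoomGame; the bin count and feature layout here are made up):

```python
import numpy as np
import vizdoom as vzd


def middle_depth_line(game: vzd.DoomGame, num_bins: int = 16) -> np.ndarray:
    """Take the horizontal line from the middle of the depth buffer and downsample
    it to num_bins coarse readings ("how far is the wall in each direction")."""
    state = game.get_state()
    depth = state.depth_buffer  # (height, width); requires game.set_depth_buffer_enabled(True)
    middle_row = depth[depth.shape[0] // 2].astype(np.float32)
    # Average consecutive columns down to num_bins values that could be fed in as extra features.
    return np.array([chunk.mean() for chunk in np.array_split(middle_row, num_bins)], dtype=np.float32)
```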

Yeah, it doesn't look like these methods are actually used anywhere, so just deleting them is sufficient. The create_vizdoom_env already adds the methods to the class, so the `self.__class__.x = ...` assignments are not necessary.

Hmm, I am a bit confused: removing the obs_class etc. things fails because we are creating a DoomEntityEnvironment in create_vizdoom_env, and making them @classmethods fails as well. As far as I understood it, the only option is to do a refactoring similar to Griddly's, where the core env class is an abstract class and the obs/action_space methods are properly added in create_vizdoom_env.

Any possibility to make obs/action_spaces non-classmethods (instance methods)? I feel it adds an unnecessary layer of complexity for config-based envs like Griddly and ViZDoom ^^
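
For illustration, the kind of Griddly-style refactor I mean would look roughly like this (names, return types, and the placeholder spaces are hypothetical, not the actual entity-gym interfaces):

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class ConfigBasedDoomEnv(ABC):
    """Core env as an abstract class; the concrete spaces depend on the loaded scenario config."""

    @abstractmethod
    def obs_space(self) -> Dict[str, Any]:
        ...

    @abstractmethod
    def action_space(self) -> Dict[str, Any]:
        ...


def create_vizdoom_env(config_path: str) -> type:
    # Derive the spaces from the scenario config once (placeholders here),
    # then bake them into a subclass so they behave like plain instance methods.
    obs_space = {"entities": ["Player", "Medikit", "Wall"]}
    action_space = {"move": ["forward", "turn_left", "turn_right"]}

    class _DoomEnv(ConfigBasedDoomEnv):
        def obs_space(self) -> Dict[str, Any]:
            return obs_space

        def action_space(self) -> Dict[str, Any]:
            return action_space

    return _DoomEnv
```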

cswinter commented 2 years ago

I feel like a combination of a tiny bit of vision + entities would work best. For example, the horizontal line from the middle of the depth map, so the agent knows where walls are. Should I try to add walls or this vision thing?

That sounds right. We don't actually have a vision network yet, but we could probably add something simple like that which can be expressed as features. The other approach might be to just filter down to the list of nearby wall entities.

Forgot about the access of obs_space in create_vizdoom_env; yeah, I guess you still do need it, it just has to have a different name than the classmethods. This is a bit of a recurring pain point; I'm not completely sure what the best solution is, but something about the API probably needs to be changed here.

Miffyli commented 2 years ago

I added the info of the N closest walls (and also updated the code to only give info on the N closest objects), and now the code runs at a reasonable pace and I am getting results similar to my original `HealthGatheringSupreme` experiments (Figure 4, mid row, second from left, blue line) using vision! Again, this is only from one run, but previously I did not get above a 4.0 score in any of multiple runs, and this time I got this on my first trial :)
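
The "N closest" filtering is just a distance sort over entity positions, roughly like this generic sketch (not the exact code in this PR):

```python
import numpy as np


def n_closest(player_xy: np.ndarray, entity_xy: np.ndarray, n: int) -> np.ndarray:
    """Return the positions of the n entities closest to the player.

    player_xy: (2,) player position; entity_xy: (num_entities, 2) entity positions.
    """
    dists = np.linalg.norm(entity_xy - player_xy, axis=1)
    return entity_xy[np.argsort(dists)[:n]]
```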

If the mypy fix is satisfactory (`type: ignore` on the classmethod), feel free to merge. I can fix that, too, if needed.

global_step=493568 charts/episodic_return=5.499166666666656  charts/episodic_length=162.5  charts/episodes=12  meanrew=0.03999999910593033
SPS: 585
global_step=495616 charts/episodic_return=7.03555555555553  charts/episodic_length=200.88888888888889  charts/episodes=9  meanrew=0.03999999910593033
SPS: 585
global_step=497664 charts/episodic_return=5.799999999999997  charts/episodic_length=170.07692307692307  charts/episodes=13  meanrew=0.03999999910593033
SPS: 585
global_step=499712 charts/episodic_return=6.109999999999979  charts/episodic_length=177.83333333333334  charts/episodes=12  meanrew=0.03999999910593033
SPS: 585

Results of vision-based PPO (blue line in the figure on the right). [image]

cswinter commented 2 years ago

Very nice, thank you @Miffyli!

Miffyli commented 2 years ago

Uh, this never ends :sweat_smile:. ViZDoom has a list of library dependencies. Not sure why the tests passed earlier, though, as it seems that the CI is not configured to have these.

Doing the following would probably work for the workflows, but are we OK with having vizdoom as a default dependency that will fail unless users install Boost and all that? I feel like it should be optional.

- name: Install vizdoom dependencies
  run: sudo apt-get install -y cmake libboost-all-dev libsdl2-dev libfreetype6-dev libgl1-mesa-dev libglu1-mesa-dev libpng-dev libjpeg-dev libbz2-dev libfluidsynth-dev libgme-dev libopenal-dev zlib1g-dev timidity tar nasm
vwxyzjn commented 2 years ago

In that case, I suggest either making it an optional dependency (e.g. https://github.com/vwxyzjn/cleanrl/blob/master/pyproject.toml#L18-L55) or removing it from poetry and installing it from pip.