cogment / cogment-verse

Research platform for Human-in-the-loop learning (HILL) & Multi-Agent Reinforcement Learning (MARL)
https://cogment.ai/cogment_verse
Apache License 2.0

156 Migrate to new gym step API #157

Closed wduguay-air closed 1 year ago

wduguay-air commented 1 year ago

Context

Cogment-verse emits a large number of warnings from third-party libraries when launching an experiment. This PR removes them.

Solution

  1. Move to the new gym step API when creating the env. The old code is currently supported only through a temporary wrapper, and it will cease to be backward compatible very soon.

The main ramification is that the step function output goes from 4 values:

next_state, reward, done, info = env.step(action)

to 5, by splitting the done flag into the flags terminated and truncated.

next_state, reward, terminated, truncated, info = env.step(action)

Truncated is meant for episodes that were cancelled for exceeding a maximum number of steps, a timeout or failed execution. On the other hand, terminated is for episodes that ended under normal environment conditions (game won, game lost, etc). For our use of the done flag, we can use terminated or truncated as a replacement.
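As a rough illustration of the replacement described above, the sketch below recombines the two flags into a single done value inside an episode loop. CountdownEnv and run_episode are hypothetical names used only for this example; the stub merely mimics the 5-value step output so the snippet runs without gym installed, and it is not code from cogment-verse.

```python
class CountdownEnv:
    """Hypothetical stand-in for an env that follows the new step API."""

    def __init__(self, goal=3, max_steps=5):
        self.goal = goal
        self.max_steps = max_steps
        self.position = 0
        self.steps = 0

    def reset(self):
        # Mirrors the new API, where reset returns (observation, info).
        self.position = 0
        self.steps = 0
        return self.position, {}

    def step(self, action):
        self.position += action
        self.steps += 1
        # Normal end of the episode: the goal was reached.
        terminated = self.position >= self.goal
        # Abnormal end: the step budget ran out before reaching the goal.
        truncated = (not terminated) and self.steps >= self.max_steps
        reward = 1.0 if terminated else 0.0
        return self.position, reward, terminated, truncated, {}


def run_episode(env, policy):
    """Run one episode, treating `terminated or truncated` as the old `done`."""
    state, _info = env.reset()
    total_reward = 0.0
    done = False
    while not done:
        next_state, reward, terminated, truncated, _info = env.step(policy(state))
        done = terminated or truncated
        total_reward += reward
        state = next_state
    return total_reward, terminated, truncated
```

With this stub, a policy that always moves forward ends with terminated set, while a policy that never moves runs out of steps and ends with truncated set. Code that only needs to know whether the episode is over can keep using the combined flag.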

  2. Pin setuptools to a version <67.3.0, because version 67.3.0 introduced a new way to declare namespaces that is not implemented in most libraries yet and raises a warning for multiple libraries. Example git issue.
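To check whether a given environment respects the setuptools pin, something like the following could be used. This is a minimal sketch: parse_release, satisfies_pin and installed_setuptools_ok are hypothetical helpers, not part of cogment-verse, and the parser only looks at the leading numeric components of the version string, so pre-release suffixes are ignored.

```python
import re
from importlib import metadata

# Upper bound taken from the pin described above (setuptools < 67.3.0).
UPPER_BOUND = (67, 3, 0)


def parse_release(version):
    """Return the leading numeric components as a 3-tuple, e.g. '67.2' -> (67, 2, 0)."""
    match = re.match(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part or 0) for part in match.groups())


def satisfies_pin(version, upper_bound=UPPER_BOUND):
    """True when the version is strictly below the upper bound."""
    return parse_release(version) < upper_bound


def installed_setuptools_ok():
    """Check the installed setuptools, or return None if it is not installed."""
    try:
        return satisfies_pin(metadata.version("setuptools"))
    except metadata.PackageNotFoundError:
        return None
```

For example, satisfies_pin("67.2.0") holds, while satisfies_pin("67.3.0") does not.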

Other changes

Bug fix for deserialize_model function. Extra arguments were not removed for cases where model_iteration is not -1. Bugfix for PR https://github.com/cogment/cogment-verse/pull/144

closes #156

wduguay-air commented 1 year ago

Added a proper test for Connect Four when selecting a specific model iteration (not -1) for the SimpleDQNActor.