Use poetry to lock dependencies

This PR uses poetry to offer a smoother installation experience and to lock dependencies for better reproducibility. To try it out, install poetry and run:

poetry install -E atari
poetry run python runner.py --train --file rl_games/configs/atari/ppo_pong.yaml
poetry run python runner.py --play --file rl_games/configs/atari/ppo_pong.yaml --checkpoint nn/PongNoFrameskip.pth

poetry install -E brax
poetry run pip install --upgrade "jax[cuda]==0.3.13" -f https://storage.googleapis.com/jax-releases/jax_releases.html
poetry run python runner.py --train --file rl_games/configs/brax/ppo_ant.yaml
poetry run python runner.py --play --file rl_games/configs/brax/ppo_ant.yaml --checkpoint runs/Ant_brax/nn/Ant_brax.pth

poetry install -E mujoco
poetry run python runner.py --train --file rl_games/configs/mujoco/humanoid.yaml

Motivation

We often run pip install mypackage, but that only changes the local environment, and it is easy for us developers to forget to pin mypackage in setup.py. To make matters worse, the dependencies of mypackage are not pinned either. With poetry, we would instead run poetry add mypackage, and poetry automatically pins all dependencies in poetry.lock, ensuring reproducibility.

For example, poetry locks the versions of transitive dependencies (the dependencies of our dependencies), unlike a regular pip install tensorboard==2.8.0, which pins only tensorboard's own version.
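
To make the difference concrete, here is a toy sketch in Python. The dependency graph below is invented for illustration (the `dep-*` names are placeholders and do not reflect tensorboard's real dependency tree); the point is only that a top-level pin fixes one package, while a lock file records the whole closure.

```python
# Toy dependency graph: package -> direct dependencies.
# Everything below the top level is a hypothetical placeholder.
TOY_GRAPH = {
    "tensorboard": ["dep-a", "dep-b"],
    "dep-a": ["dep-c"],
    "dep-b": [],
    "dep-c": [],
}


def transitive_closure(graph, root):
    """Return every package reachable from root, including root itself."""
    seen = set()
    stack = [root]
    while stack:
        pkg = stack.pop()
        if pkg in seen:
            continue
        seen.add(pkg)
        stack.extend(graph.get(pkg, []))
    return sorted(seen)


# `pip install tensorboard==2.8.0` fixes exactly one version.
pinned_by_pip = ["tensorboard"]

# A lock file records a version for every package in the closure.
locked_by_poetry = transitive_closure(TOY_GRAPH, "tensorboard")

# Packages whose versions are still free to change under a top-level pin.
unpinned = sorted(set(locked_by_poetry) - set(pinned_by_pip))
print("left floating by a top-level pin:", unpinned)
```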

Poetry would especially improve on the current setup.py by making it easier to install the optional dependencies, which are currently not locked:

      install_requires=[
            # this setup is only for pytorch
            # 
            'gym>=0.17.2',
            # 'gym[atari]',
            # 'gym[box2d]',
            'torch>=1.7.0',
            'numpy>=1.16.0',
            'ray>=1.1.0',
            'tensorboard>=1.14.0',
            'tensorboardX>=1.6',
            'setproctitle',
            'psutil',
            'pyyaml'
            # Optional dependencies
            # 'opencv-python>=4.1.0.25',
            # 'tensorflow-gpu==1.14.0',
            # 'gym-super-mario-bros==7.1.6',
            # 'pybullet>=2.5.0',
            # 'smac',
            # 'dm_control',
            # 'dm2gym',
      ],

Currently the isaacgym-related workflow has roughly the following instructions:

pip install -e isaacgym
pip install -e isaacgymenvs
pip install -e rl-games

Such instructions are not reproducible if written to the README.md file. People running these commands a month from today could end up with drastically different build environments. This often breaks code: if isaacgym introduces a breaking change, isaacgymenvs and any RL library built on top of it will break with some arcane error such as module X is not found.
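
One way to see this kind of drift is to snapshot the installed packages at two points in time and diff them. The sketch below is illustrative only: the helper names are made up, the two example snapshots are invented, and `importlib.metadata` assumes Python 3.8 or newer.

```python
from importlib import metadata


def snapshot_environment():
    """Map each installed distribution name to its installed version."""
    snapshot = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip broken installs that carry no name
            snapshot[name] = dist.version
    return snapshot


def diff_snapshots(before, after):
    """Return {package: (old_version, new_version)} for every difference."""
    changed = {}
    for name in sorted(set(before) | set(after)):
        old, new = before.get(name), after.get(name)
        if old != new:
            changed[name] = (old, new)
    return changed


# Invented snapshots standing in for "today" and "a month from today".
today = {"gym": "0.23.1", "numpy": "1.21.6"}
next_month = {"gym": "0.24.0", "numpy": "1.21.6", "new-dep": "1.0.0"}
drift = diff_snapshots(today, next_month)
print(drift)
```

In practice `snapshot_environment()` would be called in each environment and the results compared; with a lock file in place, the diff should come back empty.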

Solution for dependency management

Handling isaacgym, isaacgymenvs and rl-games together can be done by creating a mono repo that links everything together:

[tool.poetry]
name = "mono-rl"
version = "0.1.0"
description = ""
authors = ["Costa Huang <costa.huang@outlook.com>"]

[tool.poetry.dependencies]
python = ">=3.7.1,<3.11"
isaacgym = {path = "./isaacgym", develop = true}
rl-games = {path = "./rl-games", develop = true}
isaacgymenvs = {path = "./isaacgymenvs", develop = true}

[tool.poetry.dev-dependencies]

[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"

This way, isaacgym, isaacgymenvs and rl-games each have their own lock file, and the mono-rl repo locks dependencies for all three packages while keeping them in editable mode via develop = true. Ultimately, we keep the same development efficiency while gaining much better reproducibility across the board. This PR introduces poetry to isaacgymenvs, and I will follow up with PRs in other projects.

What if we still want a `requirements.txt`?

This is not a problem. We can run poetry export -f requirements.txt --output requirements.txt to pin dependencies the old-fashioned way. We can also automate this with a pre-commit hook that re-exports requirements.txt whenever the lock file changes.
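
A hook along these lines could be sketched in Python as follows. This is an assumption-laden illustration, not part of this PR: the function names are hypothetical, the hook compares file contents rather than watching the lock file, and depending on the poetry version `poetry export` may need the export plugin installed.

```python
import subprocess
from pathlib import Path


def export_from_lock(repo_root: Path) -> str:
    """Ask poetry to render the locked dependencies in requirements.txt format."""
    result = subprocess.run(
        ["poetry", "export", "-f", "requirements.txt"],
        cwd=repo_root,
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def sync_requirements(repo_root: Path, exporter=export_from_lock) -> bool:
    """Rewrite requirements.txt when it differs from the export.

    Returns True if the file was (re)written, False if it was already current.
    """
    target = repo_root / "requirements.txt"
    exported = exporter(repo_root)
    current = target.read_text() if target.exists() else None
    if exported == current:
        return False
    target.write_text(exported)
    return True


def main() -> int:
    """Hook entry point: a non-zero exit asks the developer to stage the new file."""
    if sync_requirements(Path(".")):
        print("requirements.txt was regenerated from poetry.lock; stage it and commit again.")
        return 1
    return 0
```

A hook script would end with `sys.exit(main())`. The `exporter` parameter exists only so the comparison logic can be exercised without invoking poetry.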

Why not just use a `requirements.txt`?

Because there is no way to force developers to run pip freeze > requirements.txt every time they change dependencies. The requirements.txt often becomes outdated and unmaintainable. Poetry enforces dependency pinning every time a dependency changes.

Downsides & limitations

Locking dependencies can take a long time (my longest lock took 3 hours) because poetry searches for a set of dependency versions that satisfies every constraint (to my knowledge an NP-complete problem), and the Python ecosystem lacks certain dependency-resolution support, which slows the process down further. That said, there are advanced techniques to mitigate this...
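
To give a feel for why resolution can be slow, here is a toy backtracking resolver in Python. It is not poetry's actual algorithm, and the package index is invented; it only shows that a resolver may have to try and discard version combinations before it finds one that satisfies every constraint.

```python
# Toy index: package -> {version: {dependency: allowed versions}}.
# Packages and versions are made up for illustration.
INDEX = {
    "app": {1: {"lib-a": {1, 2}, "lib-b": {1, 2}}},
    "lib-a": {2: {"core": {2}}, 1: {"core": {1}}},
    "lib-b": {2: {"core": {1}}, 1: {"core": {1, 2}}},
    "core": {2: {}, 1: {}},
}


def resolve(requirements, chosen=None, stats=None):
    """Depth-first search for one version per package satisfying all constraints.

    requirements: list of (package, allowed_versions) still to satisfy.
    chosen: dict of package -> version picked so far.
    stats: dict counting how many candidate versions were tried.
    Returns a dict of pinned versions, or None when no assignment exists.
    """
    chosen = dict(chosen or {})
    stats = stats if stats is not None else {"tried": 0}
    if not requirements:
        return chosen
    (package, allowed), rest = requirements[0], requirements[1:]
    if package in chosen:
        # Already pinned: the existing pin must satisfy this constraint too.
        if chosen[package] in allowed:
            return resolve(rest, chosen, stats)
        return None
    # Try newer versions first.
    for version in sorted(allowed & set(INDEX[package]), reverse=True):
        stats["tried"] += 1
        new_requirements = rest + list(INDEX[package][version].items())
        result = resolve(new_requirements, {**chosen, package: version}, stats)
        if result is not None:
            return result
    return None  # every candidate failed, so the caller must backtrack


stats = {"tried": 0}
solution = resolve([("app", {1})], stats=stats)
print(solution, stats)
```

Here the newest lib-b conflicts with the newest lib-a over core, so the search has to back out of that choice. With hundreds of packages and many versions each, the number of such dead ends can grow very quickly.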

Another limitation is that poetry cannot distinguish builds of the same torch version against different CUDA versions (10 or 11), which causes a series of problems. We can avoid this by leaving torch out of the lock file. Our installation would then look like:

poetry install
# install cuda related dependencies
poetry run pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html
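
The `+cu113` suffix is what PEP 440 calls a local version identifier. The sketch below uses plain string handling and a hypothetical helper name, with two example version strings, to show that CUDA builds of the same torch release share one public version and differ only in that suffix.

```python
def split_local_version(version: str):
    """Split 'X.Y.Z+local' into (public, local); local is None when absent."""
    public, sep, local = version.partition("+")
    return public, (local if sep else None)


# Example version strings for two CUDA builds of the same release.
cuda_10_build = "1.11.0+cu102"
cuda_11_build = "1.11.0+cu113"

public_10, local_10 = split_local_version(cuda_10_build)
public_11, local_11 = split_local_version(cuda_11_build)

# Same public version, different local tags.
print(public_10 == public_11)
print(local_10, local_11)
```

Because a constraint such as `torch==1.11.0` names only the public version, it does not by itself say which CUDA build is wanted.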
