Kinds-of-Intelligence-CFI / animal-ai-python

Animal-AI Python
https://github.com/Kinds-of-Intelligence-CFI/animal-ai
MIT License

[MAJOR] Clean-up / Implementing subset of ml-agents package #6

Closed alhasacademy96 closed 6 months ago

alhasacademy96 commented 7 months ago

Proposed change(s)

The PR introduces significant changes to animal-ai-python:

  1. It removes the dependency on the full ml-agents package and implements only the ml-agents.envs subpackage. This makes the package substantially lighter and faster to download and install, and also reduces the chance of dependency conflicts users have experienced in the past.

  2. The package has a new 'core tests' folder, and these tests must pass before any new PR (this PR included) is merged. The core tests exercise the foundational code to make sure nothing breaks with the changes, additions, or deletions a PR brings. A first series of tests is included, and more will be added in the near future. In addition, we now use CI/CD when publishing new versions of animal-ai-python (e.g. to PyPI); the tests are part of the workflow and run automatically.

  3. With the help of the team (Wout, Kozzy), the repo has been reorganized and modernized to follow the PEP 440 versioning standard (1).

  4. Added a new script, update_dependencies, to automate dependency updates when required (see the script for comments). It is called only when publishing to PyPI; if it fails, the dependency error is caught and reported.

  5. Cleaned up README.md for a fresh look.

  6. Cleaned/renamed some scripts for readability and organization.

Note that, because we now depend only on ml-agents.envs rather than the full ml-agents package, this and all future versions of animal-ai-python are backwards incompatible.
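To illustrate point 3 above: PEP 440 defines the canonical format a Python package version must follow. The sketch below checks version strings against a deliberately simplified subset of that format (no epochs or local version labels); it is an illustration, not the full grammar from the spec.

```python
import re

# Simplified PEP 440 public-version pattern (a sketch, not the full
# grammar from the spec -- epochs and local versions are omitted).
PEP440_RE = re.compile(
    r"^\d+(\.\d+)*"      # release segment: N(.N)*
    r"((a|b|rc)\d+)?"    # optional pre-release, e.g. rc1
    r"(\.post\d+)?"      # optional post-release
    r"(\.dev\d+)?$"      # optional dev-release
)

def is_pep440(version: str) -> bool:
    """Return True if `version` matches the simplified PEP 440 form."""
    return PEP440_RE.match(version) is not None

print(is_pep440("4.0.0"))     # True
print(is_pep440("4.0.0rc1"))  # True
print(is_pep440("v4.0.0"))    # False: a leading 'v' is not canonical
```

In practice the `packaging` library's `Version` class does this parsing for real projects; the regex here only shows the shape of the standard.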

Useful links (Github issues, ML-Agents forum threads etc.)

(1) - (https://sethmlarson.dev/pep-440)

Types of change(s)

Checklist

Other comments

All tests under test/core have passed my unit and functional testing for this PR.

I've also tested the CI/CD workflow locally and everything works (I used hatch to build and staged publishing with twine).
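The hatch-then-twine publish flow described above might look roughly like this as a GitHub Actions job (a hypothetical sketch: the job layout, Python version, and secret name are assumptions, not the repo's actual workflow file):

```yaml
# Hypothetical sketch of a publish workflow: build with hatch, upload with twine.
name: publish
on:
  release:
    types: [published]
jobs:
  publish:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.9"
      - run: pip install hatch twine
      - run: hatch build          # produces dist/*.whl and dist/*.tar.gz
      - run: twine upload dist/*  # publish to PyPI
        env:
          TWINE_USERNAME: __token__
          TWINE_PASSWORD: ${{ secrets.PYPI_API_TOKEN }}
```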

Finally, this PR is still a WIP, so more tests and documentation will be added where necessary.

Screenshots (if any)

wschella commented 7 months ago

This is quite hard to review, since there are no commit messages and (a lot of) formatting changes are mixed with behavior / code changes.

I won't block the merge or mingle too much, as I am still on sick leave.

One important comment, though: the update-dependencies script is invalid. All dependencies should be installed at the same time; otherwise pip will not guarantee that the sub-dependencies are conflict-free (i.e. they might be overridden by each successive install). In any case, don't existing tools handle this already, e.g. pip install --upgrade .?

alhasacademy96 commented 7 months ago

There are commit descriptions; I like to keep the commit titles simple and short.

wschella commented 7 months ago

Apologies, I now see they are in the message instead of the title.

alhasacademy96 commented 7 months ago

The purpose of the update_dependencies.py script is to update the dependencies only during publishing to PyPI, with additional safety checks: the versions are all updated at the same time, and if any dependency is not in the allowed set the update fails, hence the safety feature.

alhasacademy96 commented 7 months ago

Also, please don't feel like you have to review, as I know you are still recovering. I only included you because this is your branch and I felt you might want to review just in case :)

I've also added a bunch of changes, so it was important to add you here :)

@wschella

wschella commented 7 months ago

The purpose of the update_dependencies.py script is to update the dependencies only during publishing to PyPI, with additional safety checks: the versions are all updated at the same time, and if any dependency is not in the allowed set the update fails, hence the safety feature.

I don't understand. Is this executed locally or in the CI pipeline?

If for CI: Our dependencies are not bundled in our pypi build, so the exact versions don't matter (version resolving will happen on the user end). What's the benefit over just installing from the pyproject.toml (which in CI will also pull the latest versions)?

If for locally: what's the difference compared to just editing pyproject.toml and installing from that? Now there are two sets of versions to keep in sync.

In any case, pip install package1==v1 && pip install package2==v2 && ... (what you are doing) is different from pip install package1==v1 package2==v2 .... We need the latter to have consistent dependencies.
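The distinction above can be made concrete. A script that shells out to pip once per package lets each call resolve in isolation, so a later install can silently override an earlier package's sub-dependencies; passing every pin in a single invocation forces the resolver to satisfy all constraints together. A sketch (the package names and versions are hypothetical, and the commands are printed rather than executed):

```python
import sys

# Hypothetical pinned dependencies, for illustration only.
pins = {"numpy": "1.21.5", "protobuf": "3.20.3", "mlagents-envs": "0.30.0"}

# Anti-pattern: one pip invocation per package. pip resolves each call
# in isolation, so a later install may override an earlier one's
# sub-dependencies without any conflict being reported.
sequential = [
    [sys.executable, "-m", "pip", "install", f"{name}=={ver}"]
    for name, ver in pins.items()
]

# Preferred: a single invocation carrying every pin, so pip's resolver
# must find one set of versions consistent with all constraints at once.
combined = [sys.executable, "-m", "pip", "install"] + [
    f"{name}=={ver}" for name, ver in pins.items()
]

print(len(sequential), "separate commands vs 1 combined command")
print(" ".join(combined[2:]))
```

If no single consistent set of versions exists, the combined form fails loudly at install time, which is exactly the guarantee the sequential form gives up.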

alhasacademy96 commented 7 months ago

An excellent point you brought up. Here's my understanding and explanation:

So, update_dependencies.py vs. pyproject.toml:

update_dependencies.py:

This script can be used to enforce specific versions of dependencies, for example during testing or experimentation. It's useful when the CI pipeline or local development environment needs specific versions rather than the latest ones. I thought it might be handy when running tests against specific dependency versions, or when we actually hit compatibility issues with, e.g., numpy.
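The kind of allow-list safety check described here could be sketched as follows (the ALLOWED_VERSIONS contents and function name are hypothetical illustrations, not the actual contents of update_dependencies.py):

```python
# Hypothetical allow-list: only these exact versions may be used when
# publishing. The real script's data would live alongside pyproject.toml.
ALLOWED_VERSIONS = {
    "numpy": {"1.21.5", "1.23.5"},
    "protobuf": {"3.20.3"},
}

def check_pins(pins):
    """Return a list of human-readable errors; an empty list means all
    requested pins are within the allowed set."""
    errors = []
    for name, version in pins.items():
        allowed = ALLOWED_VERSIONS.get(name)
        if allowed is None:
            errors.append(f"{name}: not an allowed dependency")
        elif version not in allowed:
            errors.append(f"{name}=={version}: not in allowed set {sorted(allowed)}")
    return errors

print(check_pins({"numpy": "1.21.5", "protobuf": "3.20.3"}))  # []
print(check_pins({"numpy": "2.0.0"}))  # one error reported
```

A publish step would then abort if `check_pins` returns any errors, before anything is installed or uploaded.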

pyproject.toml:

For consistent dependency resolution, it is indeed better to install all dependencies in a single command, as you suggested. This can be achieved with pip install -r requirements.txt (which we're not using) or pip install . (which installs the current package and its dependencies as specified in pyproject.toml, formerly setup.py for us).

I do understand it's another layer of management on top of pyproject.toml, but I wanted the flexibility of more controlled testing environments, since our pyproject.toml pulls the latest versions of packages. Also, bear in mind that if there is nothing to change in update_dependencies.py (specifically allowed_versions), the dependency versions will be identical to those defined in pyproject.toml.

This matters because we use very specific versions of, say, ml-agents, numpy, and protobuf, and can't simply use their latest releases without running into compatibility issues; that will probably remain true even if we migrate to ml-agents.envs 1.0.0.
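Worth noting for this discussion: exact pins can also live in pyproject.toml itself, so "pyproject.toml uses the latest version" only holds when its specifiers are open-ended. A hypothetical excerpt (the versions shown are illustrative, not the project's actual pins):

```toml
# Hypothetical excerpt: pyproject.toml can pin exact versions directly,
# so latest-version drift only happens with open-ended specifiers.
[project]
name = "animalai"
dependencies = [
  "numpy==1.21.5",              # exact pin: pip will never pull a newer release
  "protobuf==3.20.3",
  "mlagents-envs>=0.30,<0.31",  # bounded range, still not "latest"
]
```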

However, this is what reviews and collaboration are for; it doesn't mean we will automatically adopt the extra layer of management (which I would be maintaining). I didn't think it would affect the end user, only us. It's for flexibility that I proposed this extra layer.

I hope that explains my reasoning for the changes.

P.S.: I am looking into implementing Docker, and this change would certainly complement it (although pyproject.toml remains a streamlined process regardless).