JuliaReinforcementLearning / ReinforcementLearning.jl

A reinforcement learning package for Julia
https://juliareinforcementlearning.org

test/runtests.jl empty (+ arch discussion) #843

Closed · filchristou closed this issue 5 months ago

filchristou commented 1 year ago

Quite shockingly, I realised test/runtests.jl is empty. What's the current way to test this package?

jeremiahpslewis commented 1 year ago

Each submodule has its own tests; see here for what gets tested by GitHub Actions.

filchristou commented 1 year ago

Is there a way to trigger it locally before pushing to GitHub?

jeremiahpslewis commented 1 year ago

I don't think so, but running it off of a branch / PR on GitHub works.

HenriDeh commented 1 year ago

You can activate a subpackage in the REPL and then run its tests like this: ]activate ReinforcementLearningCore, then ]test ReinforcementLearningCore (for the case of RLCore). Testing the entire metapackage requires doing this one by one. Maybe we could automate this. Be careful: the structure of RL.jl means that the released versions of the dependencies will be used, not your local ones. So if you made changes to RLBase and want to test RLCore, do not forget to ]dev src/ReinforcementLearningBase.
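For concreteness, a full session for the RLCore case might look like this (a sketch; it assumes the repository root as working directory and the subpackage layout under src/):

```julia-repl
# from the repository root, press ] at the julia> prompt to enter Pkg mode
(@v1.9) pkg> activate src/ReinforcementLearningCore

# use the local RLBase checkout instead of the registered release
(ReinforcementLearningCore) pkg> dev src/ReinforcementLearningBase

# run the subpackage's own test suite
(ReinforcementLearningCore) pkg> test
```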

filchristou commented 1 year ago

Has anyone tried a workflow with TestEnv.jl? ]test is just too slow when many incremental changes are needed.
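The workflow I have in mind would look roughly like this (a sketch; it assumes TestEnv.jl is installed in your default environment and the subpackage environment is already activated as above):

```julia-repl
julia> using TestEnv

# builds a temporary environment that also contains the test-only dependencies
julia> TestEnv.activate("ReinforcementLearningCore")

# after each edit, re-include instead of paying the full ]test setup cost
julia> include("src/ReinforcementLearningCore/test/runtests.jl")
```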

filchristou commented 1 year ago

Maybe we could automate this.

What if we create a separate environment in ReinforcementLearning.jl/test/ that has its own Manifest.toml deving all the local packages? I am a fan of this approach myself, and it might fit well here (some discussion at https://julialang.zulipchat.com/#narrow/stream/225583-appreciation/topic/Run.2Ejl).
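Concretely, setting it up once could look like this (a sketch; the subpackage list is illustrative, not exhaustive):

```julia-repl
# from the repository root:
(@v1.9) pkg> activate test

# dev every local subpackage into the shared test environment
(test) pkg> dev ./src/ReinforcementLearningBase ./src/ReinforcementLearningCore ./src/ReinforcementLearningEnvironments
(test) pkg> add Test

# test/Manifest.toml now pins the local copies, so activating test/
# always picks up in-repo changes to every deved subpackage
```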

jeremiahpslewis commented 1 year ago

Want to put together a PR with some instructions on how to use it? Happy to try it out!

filchristou commented 1 year ago

I have been rethinking the requirements and scenarios for such a local deving/testing machinery.

Summary

My understanding is that the general ReinforcementLearning (RL) module/package is just an umbrella that pins the (most recent) eligible version combinations of the submodules. Developing RL.jl itself basically means updating the version bounds in its Project.toml. This also means that a specific installed RL instance (add RL) quite possibly uses submodules released under different RL.jl versions.

Scenarios

Developing

When the user wants to make certain changes in the code base.

1. Developing a specific submodule (e.g. RLBase, RLCore, RLEnvironments, ...)

The user specifically wants to alter the functionality of a single submodule.

Implementation and caveats

We must dev the specific submodule and let all others default to the general registry. deving a submodule freezes its dependency versions to what is currently in its Project.toml. Therefore, we must not dev all submodules, because their versions might not even be mutually compatible. It may also happen that the registered submodules cannot satisfy the compat bounds of the deved submodule of interest. This is desirable: it gives the user a heads-up that the current changes are not possible without deving another submodule as well, and possibly changing its Project.toml.
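A sketch of this scenario in the REPL (RLCore stands in for the submodule of interest; the temporary environment name is whatever Pkg generates):

```julia-repl
# scratch environment for the experiment
(@v1.9) pkg> activate --temp

# local copy of the submodule under development
(jl_TmpEnv) pkg> dev ./src/ReinforcementLearningCore

# everything else comes from the general registry
(jl_TmpEnv) pkg> add ReinforcementLearningBase

# a resolver error at this point is the "heads up" described above: the
# registered packages cannot satisfy the deved submodule's compat bounds
```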

2. Developing a combination of submodules

Implementation

dev all submodules of interest and continue as before; see the sketch below.
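In REPL terms (again a sketch with assumed paths):

```julia-repl
(@v1.9) pkg> activate --temp

# both local copies are now used together; the registry is only consulted
# for submodules that were not deved
(jl_TmpEnv) pkg> dev ./src/ReinforcementLearningBase ./src/ReinforcementLearningCore
```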

3. Developing RL.jl

Implementation

Don't dev anything. Just change the RL/Project.toml as you wish.
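Since Julia 1.8, the compat bounds can also be edited from Pkg mode instead of editing the TOML by hand (the bound below is purely illustrative):

```julia-repl
# activate the top-level RL.jl project from the repository root
(@v1.9) pkg> activate .

# writes the entry into [compat] in Project.toml
(ReinforcementLearning) pkg> compat ReinforcementLearningCore 0.12
```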

4. Selecting a commit to fork?

This is something I am still confused about. Most Julia packages have linear version development: if people want to develop something in a package, they take the latest commit on the main branch and build on it.

I think in the RL.jl universe it's possible that people want to develop older versions of submodules than what is currently on the main branch. Am I right? This adds extra confusion: the user needs to fork an older RL.jl commit, where the desired submodule version is located. Fortunately there are nicely organized tags to navigate with. But if the user wants to develop older versions of more than one submodule, they might need to fork two different RL.jl commits. This, on top of the versioning, is quite confusing.

Testing

The scenarios for testing are exactly the same as above. Testing requires some extra dependencies, so each testing scenario needs a superset of the dependencies of the corresponding deving scenario.

As a result, I think a 100% automatic deving/testing machinery is not possible. The user must always specify their intentions.

Subpackage design

Sorry, but after quite some time of using the package it's still hard for me to completely digest this design. I don't know what the people in https://github.com/JuliaLang/Pkg.jl/issues/1233 are doing, what the state is, and above all whether it will be helpful for RL.jl. But currently I see a lot of trouble stemming from this design. Basically the RL.jl umbrella introduces additional manual versioning restrictions that Pkg.jl could handle on its own. In the end, we could still have it as a dummy package that @reexports all the other packages (registered in the non-subpackage way). Furthermore, I don't think package extensions are even possible for subpackages at the moment (?), and RL.jl desperately needs them: it's so much overhead to have the user install Zygote, CUDA and Flux just to run a simple tabular RL experiment.
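To illustrate, the dummy-package alternative would be tiny; a sketch, assuming the subpackages were registered independently under their current names:

```julia
module ReinforcementLearning

using Reexport

# each of these would be an independently registered package; Pkg would
# resolve their versions instead of the umbrella pinning them manually
@reexport using ReinforcementLearningBase
@reexport using ReinforcementLearningCore
@reexport using ReinforcementLearningEnvironments

end
```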

You are much more experienced with this package and I am happy to hear your thoughts. I am happy to be persuaded.

HenriDeh commented 1 year ago

I think in the RL.jl universe it's possible that people want to develop older versions of submodules than what is currently on the main branch. Am I right?

I guess it's possible, especially in the current situation where a lot of stuff is no longer on the main branch. And I agree that it's probably a mess to work with. Good tags are certainly essential for them.

Subpackage design

The current subpackage design has given me so many headaches. If it were up to me, I'd totally be in favor of a complete reorganization of the project into something more akin to Optimization.jl, which serves a somewhat similar (but not quite the same) purpose of being a metapackage. That is, it's just a metapackage: it does not "contain" the subpackages, it only imports them. Each RLXxx.jl would be its own package. The thing is, it's not up to me; it's Jun's package originally, and I don't know how disruptive the changes we make might be.

Furthermore, I don't think package extensions are even possible for subpackages at the moment (?), and RL.jl desperately needs them: it's so much overhead to have the user install Zygote, CUDA and Flux just to run a simple tabular RL experiment.

That's a good argument in favor of a standard design, yes. I don't know whether subpackages are compatible with extensions, but even if they are, it may be painful to implement.
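For illustration, under the standard design a Flux extension could take roughly this shape (the module name and file layout are hypothetical; extensions also need [weakdeps]/[extensions] entries in the host package's Project.toml):

```julia
# hypothetical file: ext/ReinforcementLearningCoreFluxExt.jl
module ReinforcementLearningCoreFluxExt

using ReinforcementLearningCore, Flux

# neural-network-specific methods would live here; the module is loaded
# only when the user also installs and loads Flux, so a tabular
# experiment never pulls in Flux, Zygote, or CUDA

end
```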

I am happy to be persuaded.

I think I agree with you, so I have nothing to persuade you about.

filchristou commented 1 year ago

something more akin to Optimization.jl

Doesn't this look more like the effort behind https://github.com/juliareinforcementlearning/commonrlinterface.jl?

BTW, the Makie.jl folks also follow a subpackage structure. I think they get away with it because they have very simple inter-dependencies.

There is some ongoing discussion about that on their Discord right now: https://discord.com/channels/996787732149981214/1000721417941295205

zsunberg commented 1 year ago

Doesn't this look more like the effort behind https://github.com/juliareinforcementlearning/commonrlinterface.jl?

CommonRLInterface is more like MathOptInterface.jl than Optimization.jl. It is an interface for RL environments that solvers can use, but does not aim to provide any sort of solve or learn function.
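For concreteness, a minimal environment implementing the CommonRLInterface API looks like this (the toy environment itself is invented for this example):

```julia
using CommonRLInterface

# a toy random walk on the integer line, terminating at |pos| >= 3
mutable struct LineWalk <: AbstractEnv
    pos::Int
end

CommonRLInterface.reset!(env::LineWalk) = (env.pos = 0)
CommonRLInterface.actions(env::LineWalk) = (-1, 1)
CommonRLInterface.observe(env::LineWalk) = env.pos
CommonRLInterface.terminated(env::LineWalk) = abs(env.pos) >= 3

# act! advances the environment and returns the reward
function CommonRLInterface.act!(env::LineWalk, a)
    env.pos += a
    return abs(env.pos) >= 3 ? 1.0 : 0.0
end
```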

I think it is important to keep the environment interface separate from policies, agents, and environment implementations. That way, people are free to implement their own learning algorithms with different strengths: e.g. someone could make a clean RL package with easy-to-read, easy-to-modify code, while someone else makes a high-performance PPO implementation that works on a wide range of environments. A tabular learning algorithm then does not need to depend on Flux, for example.

This is similar to the function of gym/gymnasium in Python: everyone writes environments that look the same, but there are multiple repositories of algorithms out there.

Optimization.jl is trying to bring together many efforts that are owned and maintained with passion by many different individual developers. Currently, ReinforcementLearning.jl tries to contain everything. This has some advantages, but I think we would end up with a much healthier ecosystem if we took a more federated, less centralized approach like the Julia optimization community's. I'll stop here to keep this post short, but I'm happy to discuss further in a different issue if people are interested.

jeremiahpslewis commented 1 year ago

Hey all! This is a great discussion, and it would be great to reach a consensus sometime in the coming week. For the moment, here are a few things from my side:

  1. Once the RL.jl refactor firms up (and I think we are close), any main branch commit should correspond to the latest versions of each subpackage, so every commit represents a consistent set of packages.

  2. The primacy of RL.jl is a bit weird, given that for many use cases one really doesn't need all of the subpackages. It's really most useful in a quick-start context, so maybe we can make this clearer by making RL.jl just another package in the repo instead of the source directory?

  3. Every separate repo increases maintainer load, so I would prefer to first increase the number of maintainers and then the number of repos, not the other way around.

  4. If/when RL.jl, and especially RLCore.jl and RLEnvs.jl, reach an adequate level of maturity, we can spin things off at will; I don't see any fundamental objections. ...but at the moment our best way of testing out an idea within a subpackage is to develop it and see whether it works across the different implementations, policies, etc. We would lose this if we spin things off right now.

  5. I think some of the developer gripes (which are legitimate) can be solved with better documentation. This is not one of my personal strengths, but I'm happy to collaborate on this. I've found that copying the CI.yml snippets is sufficient to set up a robust dev environment (see the sketch below); I know others have different but also working approaches.
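A rough local approximation of what the CI does per subpackage (a sketch; the actual CI.yml steps may differ):

```julia
import Pkg

# test one subpackage the way CI would, from the repository root
Pkg.activate("src/ReinforcementLearningCore")
Pkg.develop(path="src/ReinforcementLearningBase")  # local deps first
Pkg.test()
```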

filchristou commented 1 year ago
  1. Once the RL.jl refactor firms up (and I think we are close), any main branch commit should correspond to the latest versions of each subpackage, so every commit represents a consistent set of packages.

That is already promising. Do you intend to keep it that way, and how will you handle new breaking features per subpackage?

  4. ...but at the moment our best way of testing out an idea within a subpackage is to develop it and see whether it works across the different implementations, policies, etc. We would lose this if we spin things off right now.

I cannot see how testing against the general registry would be worse here. Or do you mean developing, as in tweaking code across several subpackages? If that happens often, then we might need to rethink whether having separate subpackages makes sense at all. The point of separate subpackages is modularity, i.e. being able to change one of them without having to change another.

5. ...

Sooner or later I will find some time and could help here, but first I really need to understand the motives and the dev workflow ^^

@zsunberg I am very happy to read your comment. Coming from the JuliaPOMDP org, do you have any aspirations/expectations for RL.jl that would help both organizations grow? Also, is there any JuliaPOMDP-JuliaRL interplay you would like to see?

zsunberg commented 1 year ago

Every separate repo increases maintainer load, so I would prefer to first increase the number of maintainers and then the number of repos, not the other way around.

I think most development on this package will be carried out by volunteers, especially grad students (unless some company or very big research org gets behind this), so our development model needs to tolerate grad students only being consistent contributors for a few years. To me, that means having a very minimal set of core packages that students can build their projects on, so that other parts can be thrown away or abandoned without weighing down the entire ecosystem.

@zsunberg I am very happy to read your comment. Coming from the JuliaPOMDP org, do you have any aspirations/expectations for RL.jl that would help both organizations grow? Also, is there any JuliaPOMDP-JuliaRL interplay you would like to see?

I mostly would like a system that makes it easy for my students to work on research projects involving RL applied to more realistic applications... I don't have time to write much at the moment, but so far none of my students have chosen to hook into RL.jl because it is so monolithic and it has been hard for them to take just the parts they need (or perhaps they have not been able to figure out how to do that).

It would also be great to collaborate on things like common types for action spaces, state spaces etc.

zsunberg commented 1 year ago

Here is perhaps an explanatory example of why my students have not decided to hook into RL.jl. People using RL for research fall into one of two categories: either they have a real-world problem that they want to use RL to solve, or they want to make better RL algorithms and test them against predecessors. These two concerns seem rather intertwined in RL.jl at the moment.

HenriDeh commented 1 year ago

The way I see it, RL.jl should primarily be a repo of working RL algorithms. I think the Zoo is the centerpiece of the ecosystem, the thing most users will come here to use, and Environments provides something to run them on.

The biggest barriers, imo, really are the incomplete and outdated documentation and the decrepit state of the packages. I recall one issue where someone wanted to run SAC on a simple environment, and the algorithm plain didn't work. He probably moved to Stable Baselines in Python.

filchristou commented 1 year ago

(Some thoughts currently on Discourse: https://discourse.julialang.org/t/large-vs-small-packages/9447/19)

zsunberg commented 1 year ago

The biggest barriers, imo, really are the incomplete and outdated documentation and the decrepit state of the packages.

Yes. In order to improve this, the community needs to understand (at least to some extent) what choices led to the current state and adopt a strategy that is less prone to it.

In the Discourse discussion, it was encouraging to hear cortner's experience that breaking up into smaller packages did make it easier for collaborators and students to contribute.