maciejors commented 11 months ago

The problem

While writing configurations for experiments I realised that there is a lot of repetitiveness involved. For example:

1. Tasks inside one curriculum often have very similar configurations.
2. Today I had to split the DQN and PPO configs because one stop condition differs and DQN needs one extra parameter.

While the second point is quite specific, the first one is obvious and common: tasks in a single curriculum usually differ only in difficulty level and potentially also in stop conditions.

What do we have

Currently, creating configs isn't much different from configuring inside a script. The only mechanism that gives config files some reusability (and therefore an edge over scripts) is the path property for tasks in a curriculum, which allows a single task configuration to be reused across multiple curricula. However, truth be told, it's not the most common use case (it could be used for curriculum vs no curriculum IF it allowed parameter overriding, since lvl2 tasks differ in stop conditions there).

Considering that it's harder to configure using files than in a script (extra files + no IntelliSense), it's no surprise that most of you opted for the latter. Configuration files feel redundant (other than perhaps being useful for people who are not confident in coding) and this needs to change.

What could be added

I have a couple of ideas which would make configuration files actually useful. Implementation details are not relevant right now, but I do have some ideas on how to implement these without making config files too complex.

Common properties in curriculum

The most useful one. This would allow common task properties to be defined once inside a curriculum config, as a separate property. Each listed task would then only define the properties that are specific to it.
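To make the idea concrete, here is a minimal sketch, assuming configs end up as plain dicts after loading. The `common` and `tasks` keys, and the property names inside them, are hypothetical and not the current config format:

```python
from copy import deepcopy


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict in which values from `override` win over `base`."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: merge them key by key instead of replacing.
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def expand_curriculum(curriculum: dict) -> list[dict]:
    """Apply the (hypothetical) `common` section to every entry in `tasks`."""
    common = curriculum.get("common", {})
    return [deep_merge(common, task) for task in curriculum.get("tasks", [])]


# Hypothetical curriculum config: shared properties are written only once.
curriculum_cfg = {
    "common": {"agent": {"type": "PPO"}, "env": {"name": "DoorKey"}},
    "tasks": [
        {"env": {"difficulty": 0}},
        {"env": {"difficulty": 1}, "stop_conditions": {"max_episodes": 5000}},
    ],
}
tasks = expand_curriculum(curriculum_cfg)
```

Each task ends up with the shared agent and env name plus its own difficulty, and a task-level value always takes precedence over the common one.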

Task overriding

This would allow a task configuration to refer to another task configuration file and override parts of it, in a similar way to what is currently possible with the path property on curriculum tasks.
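A rough sketch of how that could look, assuming JSON files purely for illustration. The `base` key, the file names and the stop condition names are all made up for this example:

```python
import json
import tempfile
from pathlib import Path


def load_task_config(path: Path) -> dict:
    """Load a task config, resolving a (hypothetical) `base` key recursively."""
    config = json.loads(path.read_text())
    base_ref = config.pop("base", None)
    if base_ref is None:
        return config
    # The reference is resolved relative to the file that contains it.
    base = load_task_config(path.parent / base_ref)
    # Shallow override for brevity: top-level keys of the child replace the base's.
    return {**base, **config}


# Demo with two throwaway files: the second one reuses the first and only
# swaps the stop conditions (the "curriculum vs no curriculum" case).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "lvl2.json").write_text(json.dumps(
        {"env": "DoorKey", "difficulty": 2, "stop_conditions": {"min_avg_reward": 0.8}}
    ))
    (root / "lvl2_no_curriculum.json").write_text(json.dumps(
        {"base": "lvl2.json", "stop_conditions": {"max_episodes": 5000}}
    ))
    resolved = load_task_config(root / "lvl2_no_curriculum.json")
```

This sketch does not guard against circular references between files, which a real implementation would need to do.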

Variables in configs

This would probably be the trickiest to implement (however, I do have an idea how to go about it). It would allow a property value to be marked as a variable which would then be substituted when loading a config. It could be useful in numerous ways, for example when we want to provide different random states for an environment when loading the same configuration multiple times, or to dynamically pass a value for a certain stop condition depending on the result of another task.
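One possible shape for this, as a sketch only: string values starting with `$` are treated as variables and replaced at load time. The `$name` marker syntax and the property names are assumptions for the example, not a decided design:

```python
def substitute(node, variables: dict):
    """Recursively replace string values of the form "$name" with variables[name]."""
    if isinstance(node, dict):
        return {key: substitute(value, variables) for key, value in node.items()}
    if isinstance(node, list):
        return [substitute(value, variables) for value in node]
    if isinstance(node, str) and node.startswith("$"):
        name = node[1:]
        if name not in variables:
            raise KeyError(f"no value provided for config variable '{name}'")
        # The substituted value keeps its own type (int, float, ...).
        return variables[name]
    return node


# Hypothetical template: the same config loaded with different random states.
template = {
    "env": {"name": "DoorKey", "random_state": "$seed"},
    "stop_conditions": {"min_avg_reward": "$target"},
}
configs = [substitute(template, {"seed": seed, "target": 0.8}) for seed in (0, 1, 2)]
```

The template itself is never modified, so it can be loaded as many times as needed with different values. A missing variable fails loudly instead of leaving a `$name` string in the config.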

This would, however, most likely require a change to the saving/loading API for LearningTask/Curriculum. While this might seem like too much at first glance, some time ago I actually had in mind that saving/loading task/curriculum configs shouldn't be done through the SavableLoadable interface. For these classes the methods might be confusing because they save configs and not states (which is what they do for all other classes that implement this interface). Therefore I thought that maybe these classes should not implement SavableLoadable, and their save/load methods should be renamed to something like save_config/load_config to emphasise that they're different from save/load in other classes.
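Roughly what I mean by the rename, as a sketch with stand-in classes. The method signatures, the `config` attribute and the JSON format are guesses made for the example and do not mirror the actual code:

```python
import json
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class SavableLoadable(ABC):
    """Stand-in for the existing interface: save/load persist an object's *state*."""

    @abstractmethod
    def save(self, path: str) -> None: ...

    @abstractmethod
    def load(self, path: str) -> None: ...


class LearningTask:
    """Stand-in task class that deliberately does NOT implement SavableLoadable."""

    def __init__(self, config: dict):
        self.config = config

    def save_config(self, path: str) -> None:
        # Only the configuration is written, never any training state.
        Path(path).write_text(json.dumps(self.config, indent=2))

    @classmethod
    def load_config(cls, path: str) -> "LearningTask":
        return cls(json.loads(Path(path).read_text()))


# Round trip through a throwaway file.
with tempfile.TemporaryDirectory() as tmp:
    config_path = str(Path(tmp) / "task.json")
    LearningTask({"env": "DoorKey", "difficulty": 1}).save_config(config_path)
    restored = LearningTask.load_config(config_path)
```

With separate method names, a reader can tell from the call site alone whether a config or a state is being written.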

When should this be added

This issue is similar to #21 in that I don't think there's a need to rush it before the thesis deadline. While these changes could be useful for experiments, the truth is we're already halfway through creating experiment configurations. Considering that this needs time to implement, it would most likely be ready after all experiments have been set up (or at best at the very end, which quite frankly wouldn't be particularly useful). That being said, these features should be one of the priorities once we're done with the first release.

krezelj commented 11 months ago

You've really hit the nail on the head with this issue. I feel exactly the same way. At the beginning, I imagined the configuration files as simple, easy-to-use things which would speed up the task creation process. However, as you've pointed out, it turned out that configuring tasks inside the code is as easy, if not easier, since you don't even have to load the config file.

This really bothered me, to the point where a few weeks ago I was thinking about suggesting a different approach to config files which would involve more natural language and thus be easier to write and use for non-coders (although I'm not sure if this is our target audience anyway).

Either way, all 3 things you suggested sound great. Not only are they not really complicated to implement but I feel each and every one of them would make it much easier to write configs.

I do agree that right now is not the best time to implement it but I would put it as high priority after the first release.

To be honest I don't know if there is anything else to add to your idea. I agree with all you've said 100% and I'm glad you brought it up.

Maybe as food for thought I'll give an example of how I imagined the natural-language-based configs. I've deleted the original example so I'll try to recreate it from memory, and it's obviously not perfect or well thought out, so treat it as a v0.0 prototype.

agent: PPO
env: DoorKey
goal: reach score 0.8 at difficulty 2
curriculum:
    task1: reach score 0.6 at difficulty 0
    task2: reach 5000 episodes at difficulty 1
    goal
save_path: ...

TBH it looked better before, I promise. I don't know why I didn't save the prototype.
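For what it's worth, goal lines like the ones in the example should not be hard to parse. A small sketch, assuming only the two phrasings used above are supported; the output keys (`difficulty`, `stop_condition`) are made up for illustration:

```python
import re

# Matches "reach score <float> at difficulty <int>"
# or "reach <int> episodes at difficulty <int>".
GOAL_PATTERN = re.compile(
    r"reach\s+(?:score\s+(?P<score>\d+(?:\.\d+)?)|(?P<episodes>\d+)\s+episodes)"
    r"\s+at\s+difficulty\s+(?P<difficulty>\d+)"
)


def parse_goal(text: str) -> dict:
    """Turn one natural-language goal line into a structured dict."""
    match = GOAL_PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"cannot parse goal: {text!r}")
    parsed = {"difficulty": int(match["difficulty"])}
    if match["score"] is not None:
        parsed["stop_condition"] = {"score": float(match["score"])}
    else:
        parsed["stop_condition"] = {"episodes": int(match["episodes"])}
    return parsed


task1 = parse_goal("reach score 0.6 at difficulty 0")
task2 = parse_goal("reach 5000 episodes at difficulty 1")
```

Anything that doesn't match raises a `ValueError`, so a typo in a keyword is reported instead of being silently ignored.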