Closed · turcato-niccolo closed this 1 year ago
Hello @turcato-niccolo ,
Thank you very much for your pull request! Your code is clean and you provided all relevant files for your controller. Regarding the pull request, I have the following remarks and questions:
.gitignore You deleted the paths
leaderboard/simulation/acrobot/data/*
leaderboard/robustness/acrobot/data/*
Please leave them in there (actually, the corresponding paths for pendubot are missing and should be added). The leaderboard results will be computed with a dedicated workflow and stored in a separate repository dedicated to the leaderboard.
Dockerfile You added the lines
RUN pip3 install numpy dill
RUN python3 -m pip install drake
please put the Python dependencies in [src/python/setup.py](https://github.com/dfki-ric-underactuated-lab/double_pendulum/blob/main/src/python/setup.py). Did you have problems with numpy or drake? Both are already listed in the setup.py file.
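For illustration, the dependency list that src/python/setup.py would pass to setuptools could look roughly like this. This is a hedged sketch: the real file may organise its metadata differently, and numpy and drake are reportedly already listed there, so dill would be the only addition from this PR.

```python
# Hypothetical sketch of the dependency list consumed by
# setuptools.setup(install_requires=...) in src/python/setup.py.
INSTALL_REQUIRES = [
    "numpy",   # already in setup.py, per the review
    "drake",   # already in setup.py, per the review
    "dill",    # the dependency this PR actually adds
]

# In setup.py this would be used roughly as:
#   setup(name="double_pendulum", install_requires=INSTALL_REQUIRES, ...)
```

Declaring dependencies in setup.py instead of extra `RUN pip3 install` lines keeps the Docker image and a local `pip install` of the package consistent.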
leaderboard data The leaderboard data does not need to be committed. The leaderboard results will be computed with a dedicated workflow and stored in a separate repository dedicated to the leaderboard. Please remove the result files
leaderboard/robustness/acrobot/data/*
leaderboard/simulation/acrobot/data/*
leaderboard/robustness/pendubot/data/*
leaderboard/simulation/pendubot/data/*
from the pull request. I am sorry, that was probably not so clear in the competition/repository rules.
Policy files Consider storing your trained policies, i.e.
leaderboard/robustness/acrobot/policy_mcpilco_nostab.np
leaderboard/robustness/pendubot/policy_mcpilco_full.np
leaderboard/simulation/acrobot/policy_mcpilco_nostab.np
leaderboard/simulation/pendubot/policy_mcpilco_full.np
in the main data folder. I would suggest the structure
data/policies/<design_X.X>/<model_X.X>/<robot>/MC-PILCO/<your_policy>
similar to the stored trajectories in the same data folder. For the naming of design and model see here. <robot>
would be either 'acrobot' or 'pendubot'. If you have multiple policies for the same design, model, robot and method you can create subfolders on the last level.
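Under that convention, a policy path could be assembled programmatically. A minimal sketch (the `design_C.0`/`model_1.0` identifiers and the helper name are assumptions for illustration; MC-PILCO's `.np` files may additionally require a custom reader such as dill rather than plain numpy):

```python
import os

def policy_path(design, model, robot, method="MC-PILCO",
                filename="policy.np", base="data/policies"):
    """Build the suggested data/policies/<design>/<model>/<robot>/<method>/ path."""
    return os.path.join(base, design, model, robot, method, filename)

# Example with assumed design/model identifiers:
p = policy_path("design_C.0", "model_1.0", "pendubot",
                filename="policy_mcpilco_full.np")
```

Keeping all policies under one root with this layout mirrors how trajectories are already stored in the data folder, so tooling can discover them uniformly.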
src/cpp/makefile Did you have problems with the include paths? The path
-I/usr/local/include
should stay there. If you want to add another path, e.g. -I/usr/local/include/eigen3
you can add that after the other include paths. Let me know if that causes a problem for you.
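Concretely, appending an extra include directory after the existing one might look like this in the makefile (a sketch; the variable name is an assumption, the real makefile may structure its flags differently):

```make
# Keep the default include path and append extra ones after it.
CXXFLAGS += -I/usr/local/include -I/usr/local/include/eigen3
```

Order matters here only in that removing `-I/usr/local/include` would break builds that rely on it; appending additional paths is harmless.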
src/cpp/model/src/.idea I am not sure what this folder is needed for. Can you explain the use of this folder? If not needed please delete from the pull request.
Would you be willing to also commit the scripts which you used for training the MC-PILCO policies? Would be nice to have them available for everybody.
Thanks again for the great submission!
Dear @fwiebe,
Thank you so much for the review!
I have addressed your requests:
-I added the leaderboard data paths to the .gitignore file.
-I moved the needed Python dependencies to setup.py.
-I removed the benchmark data.
-As you suggested, I moved MC-PILCO's policy files to the main data folder with the structure:
data/policies/<design_X.X>/<model_X.X>/<robot>/MC-PILCO/<policy>
-I fixed the includes in the cpp makefile; it compiles in the Docker container.
-Removed the .idea folder. Sorry about that, my editor created it and I didn't notice.
-At the moment it is quite impractical for us to integrate the training code into this repository. Moreover, MC-PILCO's code is mostly composed of custom libraries, which are still in development.
However, while fixing the code following your directives, I noticed a major bug in the calculate_leaderboard_score.py scripts (in the leaderboard folders).
The performance score weights in those scripts are not the same as the ones indicated at: https://ijcai-23.dfki-bremen.de/competitions/ai_olympics/
I didn't notice this before, since I had always used the create_leaderboard.py script.
I suspect that the mismatch reported in the initial software review was due to this bug. I fixed it in this last commit.
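The kind of mismatch described here is easy to reason about: a leaderboard-style score is typically a weighted sum of per-criterion scores, and two scripts agree only if they use the same weight vector. A minimal sketch of that idea (the criterion names and all weight values below are placeholders for illustration, not the competition's actual weights):

```python
def performance_score(criteria, weights):
    """Weighted sum of normalised per-criterion scores in [0, 1]."""
    assert set(criteria) == set(weights), "scripts must use the same criteria"
    return sum(weights[k] * criteria[k] for k in criteria)

# One simulated run, scored under two different (hypothetical) weight vectors.
scores   = {"swingup_success": 1.0, "energy": 0.6, "smoothness": 0.8}
official = {"swingup_success": 0.5, "energy": 0.25, "smoothness": 0.25}
buggy    = {"swingup_success": 0.4, "energy": 0.3,  "smoothness": 0.3}

# Different weight vectors give different totals for the same run,
# which is exactly the mismatch a duplicated-but-diverged script causes.
```

This is why keeping the weights in a single shared module, imported by both calculate_leaderboard_score.py and create_leaderboard.py, would prevent the two scripts from drifting apart again.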
I tested by adding:
COPY . ./double_pendulum/
instead of
RUN git clone https://github.com/dfki-ric-underactuated-lab/double_pendulum.git
in the Dockerfile, to test the local branch. I then restored the Dockerfile before committing changes.
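The swap described above looks roughly like this in the Dockerfile (a sketch; the surrounding context of the real Dockerfile is an assumption):

```dockerfile
# Upstream Dockerfile: fetch the published repository.
# RUN git clone https://github.com/dfki-ric-underactuated-lab/double_pendulum.git

# Temporary local-testing variant: copy the checked-out branch instead,
# so the image builds against the code under review.
COPY . ./double_pendulum/
```

Using `COPY` ties the image to the local build context, which is exactly what is wanted for testing a branch; restoring the `git clone` line before committing keeps the published Dockerfile reproducible for everyone else.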
Controllers work as expected and now calculate_leaderboard_score.py
computes the correct scores, which are the ones we stated in the report.
Thank you @turcato-niccolo ,
the PR looks good and I will merge it now. And thanks for pointing out and fixing the wrong weights in the calculate_leaderboard_score.py
scripts. The final ranking will be calculated with the correct weights :)
@turcato-niccolo, @fwiebe I don't see the documentation appearing on this page: https://dfki-ric-underactuated-lab.github.io/double_pendulum/control.html I would highly encourage that proper documentation of the PILCO controller theory and API is provided, so that it is beneficial to the community.
I forgot to mention that the code for the base version of MC-PILCO is publicly available at: https://www.merl.com/research/license/MC-PILCO
Pull request for the contribution "Athletic Intelligence Olympics challenge with Model-Based Reinforcement Learning". Authors: Alberto Dalla Libera, Niccolo’ Turcato, Giulio Giacomuzzo, Ruggero Carli and Diego Romeres.