dfki-ric-underactuated-lab / double_pendulum

Dual purpose Acrobot and Pendubot Platform
BSD 3-Clause "New" or "Revised" License
33 stars 22 forks source link

Final submission ijcai 23 dfki bremen ai olympics competition #3

Closed turcato-niccolo closed 1 year ago

turcato-niccolo commented 1 year ago

Pull request for the contribution "Athletic Intelligence Olympics challenge with Model-Based Reinforcement Learning". Authors: Alberto Dalla Libera, Niccolo’ Turcato, Giulio Giacomuzzo, Ruggero Carli and Diego Romeres.

fwiebe commented 1 year ago

Hello @turcato-niccolo ,

Thank you very much for your pull request! Your code is clean and you provided all relevant files for you controller. Regarding the pull request, I have the following remarks and questions:

  1. .gitignore You deleted the paths

    leaderboard/simulation/acrobot/data/*
    leaderboard/robustness/acrobot/data/*

    Please leave them in there (Actually the paths for pendubot are missing). The leaderboard results will be computed with a dedicated workflow and stored in a separate repository dedicated to the leaderboard.

  2. Dockerfile You added the lines

    RUN pip3 install numpy dill
    RUN python3 -m pip install drake

    please put the python dependencies in (src/python/setup.py)[https://github.com/dfki-ric-underactuated-lab/double_pendulum/blob/main/src/python/setup.py]. Did you have problems with numpy or drake? Both are already listed in the setup.py file.

  3. leaderboard data The leaderboard data does not need to be commited. The leaderboard results will be computed with a dedicated workflow and stored in a separate repository dedicated to the leaderboard. Please remove the result files

    leaderboard/robustness/acrobot/data/*
    leaderboard/simulation/acrobot/data/*
    leaderboard/robustness/pendubot/data/*
    leaderboard/simulation/pendubot/data/*

    from the pull request. I am sorry, that was probably not so clear in the competition/repository rules.

  4. Policy files Consider storing your trained policies, i.e.

    leaderboard/robustness/acrobot/policy_mcpilco_nostab.np
    leaderboard/robustness/pendubot/policy_mcpilco_full.np
    leaderboard/simulation/acrobot/policy_mcpilco_nostab.np
    leaderboard/simulation/pendubot/policy_mcpilco_full.np

    in the main data folder. I would suggest the structure

    data/policies/<design_X.X>/<model_X.X>/<robot>/MC-PILCO/<your_policy>

    similar to the stored trajectories in the same data folder. For the naming of design and model see here. <robot> would be either 'acrobot' or 'pendubot'. If you have multiple policies for the same design, model, robot and method you can create subfolders on the last level.

  5. src/cpp/makefile Did you have problems with the include paths? The path

    -I/usr/local/include

    should stay there. If you want to add another path, e.g. -I/usr/local/include/eigen3 you can add that after the other include paths. Let me know if that causes a problem for you.

  6. src/cpp/model/src/.idea I am not sure what this folder is needed for. Can you explain the use of this folder? If not needed please delete from the pull request.

  7. Would you be willing to also commit the scripts which you used for training the MC-PILCO policies? Would be nice to have them available for everybody.

Thanks again for the great submission!

turcato-niccolo commented 1 year ago

Dear @fwiebe,

Thank you so much for the review!

I have addressed your requests: -I added the leaderboard data paths to gitignore file. -I moved the needed Python dependencies to setup.py. -I removed the benchmark data. -As you suggested, I moved MC-PILCO's policy files to the main data folder with the structure: data/policies/<design_X.X>/<model_X.X>/<robot>/MC-PILCO/<policy> -I fixed the includes in the cpp makefile, it compiles in the docker. -Removed .idea folder. Sorry about that. My editor inserted it and I didn't notice it. -At the moment it is quite unpractical for us to integrate the training code into this repository. Moreover, MC-PILCO's code is mostly composed of custom libraries, which are still in development.

However, while fixing the code following your directives, I noticed a major bug in the calculate_leaderboard_score.py scripts (leaderboard folders). The perf. score's weights in the script are not the same as indicated in: https://ijcai-23.dfki-bremen.de/competitions/ai_olympics/ I didn't notice this before, since I have always used the create_leaderboad.py script. I suspect that the mismatch stated in the initial software review was due to this bug. I fixed it in this last commit.

I tested by adding: COPY . ./double_pendulum/ instead of RUN git clone https://github.com/dfki-ric-underactuated-lab/double_pendulum.git in the Dockerfile, to test the local branch. I then restored the Dockerfile before committing changes. Controllers work as expected and now calculate_leaderboard_score.py computes the correct scores, which are the ones we stated in the report.

fwiebe commented 1 year ago

Thank you @turcato-niccolo ,

the PR looks good and I will merge it now. And thanks for pointing out and fixing the wrong weights in the calculate_leaderboard_score.py scripts. The final ranking will be calculated with the correct weights :)

shivesh1210 commented 1 year ago

@turcato-niccolo, @fwiebe I don't see the documentation appearing on this page: https://dfki-ric-underactuated-lab.github.io/double_pendulum/control.html I would highly encourage that a proper documentation of the PILCO controller theory and API is provided so that it is beneficial to the community.

turcato-niccolo commented 1 year ago

Dear @fwiebe,

Thank you so much for the review!

I have addressed your requests: -I added the leaderboard data paths to gitignore file. -I moved the needed Python dependencies to setup.py. -I removed the benchmark data. -As you suggested, I moved MC-PILCO's policy files to the main data folder with the structure: data/policies/<design_X.X>/<model_X.X>/<robot>/MC-PILCO/<policy> -I fixed the includes in the cpp makefile, it compiles in the docker. -Removed .idea folder. Sorry about that. My editor inserted it and I didn't notice it. -At the moment it is quite unpractical for us to integrate the training code into this repository. Moreover, MC-PILCO's code is mostly composed of custom libraries, which are still in development.

However, while fixing the code following your directives, I noticed a major bug in the calculate_leaderboard_score.py scripts (leaderboard folders). The perf. score's weights in the script are not the same as indicated in: https://ijcai-23.dfki-bremen.de/competitions/ai_olympics/ I didn't notice this before, since I have always used the create_leaderboad.py script. I suspect that the mismatch stated in the initial software review was due to this bug. I fixed it in this last commit.

I tested by adding: COPY . ./double_pendulum/ instead of RUN git clone https://github.com/dfki-ric-underactuated-lab/double_pendulum.git in the Dockerfile, to test the local branch. I then restored the Dockerfile before committing changes. Controllers work as expected and now calculate_leaderboard_score.py computes the correct scores, which are the ones we stated in the report.

I forgot to mention that the code for the base version of MC-PILCO is publicly available at: https://www.merl.com/research/license/MC-PILCO