This repository is the main source of code for Kristian Brudeli's master's thesis, written in autumn 2022 at the Department of Engineering Cybernetics at NTNU.
The project is about using model-based Reinforcement Learning algorithms to solve a Path-Following and Collision-Avoidance task.
The project builds upon the custom openai/gym environment EivMeyer/gym-auv. The existing work shows promising and exciting results, but we do not know a priori what the system plans to do. Even the existing systems themselves do not know.
To improve on this, the master's thesis builds on the Dreamer and PlaNet papers to do model-based reinforcement learning. The method learns a latent-space model as described in the PlaNet paper, which may be used either for MPC-style planning (PlaNet) or for training without access to the environment itself (Dreamer).
In both cases, the method learns a model that should be able to predict future observations.
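To make the MPC-style use of a latent model concrete, here is a minimal sketch. It is not the thesis implementation: the learned model is replaced by a fixed, hand-written linear toy model (the matrices `A`, `B` and the reward weights `w` are hypothetical), and the planner uses plain random shooting in place of the cross-entropy method used in PlaNet. It only illustrates the idea of rolling candidate action sequences forward in latent space and executing the first action of the best one.

```python
import numpy as np


def rollout_return(z0, actions, A, B, w):
    """Roll a toy linear latent model forward and sum the predicted rewards."""
    z = z0
    total = 0.0
    for a in actions:
        z = A @ z + B @ a        # toy latent transition: z' = A z + B a
        total += float(w @ z)    # toy reward prediction: r = w . z'
    return total


def plan_random_shooting(z0, A, B, w, horizon=5, n_candidates=256, seed=0):
    """Sample action sequences, score them in latent space, return the best first action."""
    rng = np.random.default_rng(seed)
    action_dim = B.shape[1]
    candidates = rng.uniform(-1.0, 1.0, size=(n_candidates, horizon, action_dim))
    returns = np.array([rollout_return(z0, seq, A, B, w) for seq in candidates])
    best = candidates[np.argmax(returns)]
    return best[0], float(returns.max())


if __name__ == "__main__":
    # Hypothetical 2-D latent state with a 1-D action; values chosen for illustration only.
    A = np.eye(2)
    B = np.array([[1.0], [0.0]])
    w = np.array([1.0, 0.0])
    first_action, predicted_return = plan_random_shooting(np.zeros(2), A, B, w)
    print("first action:", first_action, "predicted return:", predicted_return)
```

In an MPC loop, only the returned first action would be applied to the environment before planning again from the next latent state. Dreamer instead uses such imagined rollouts to train a policy, which this sketch does not cover.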
This master's thesis aims to:
Dreamer-algorithm to work with gym-auv
gym-auv-environment

Create a Conda environment:
conda env create --file environment.yml
Activate the environment and run the code.
TODO: Write about adding gym-auv to .zshrc etc., i.e.
export PYTHONPATH="${PYTHONPATH}:/PATH/TO/REPOSITORY/master-thesis/gym-auv/"
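After setting PYTHONPATH, a quick check from Python can confirm that the package is actually found. The sketch below assumes the importable package inside the gym-auv directory is named `gym_auv`; adjust the name if the layout differs.

```python
import importlib.util
import sys


def is_importable(module_name):
    """Return True if a top-level module can be located on the current sys.path."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "gym_auv" is an assumed package name, not confirmed by this README.
    name = "gym_auv"
    if is_importable(name):
        print(f"{name} found")
    else:
        print(f"{name} not found; current sys.path entries:")
        for entry in sys.path:
            print("  ", entry)
```

If the package is not found, the printed `sys.path` entries show whether the exported directory was picked up by the shell that launched Python.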