This repository contains instructions for reproducing the experiments from the paper "Neuroevolution of Recurrent Architectures on Control Tasks", published at GECCO 2022 and the ALOE ICLR 2022 Workshop.
The code itself is located in a larger library called Nevo.
# Debian packages (python3-dev and libopenmpi-dev are for mpi4py; g++, swig, libosmesa6-dev, patchelf and ffmpeg are for Gym)
sudo apt install git python3-virtualenv python3-dev libopenmpi-dev g++ swig libosmesa6-dev patchelf ffmpeg
# MuJoCo
wget -P ~/Downloads/
mkdir -p ~/.mujoco/ && tar -zxf ~/Downloads/mujoco210-linux-x86_64.tar.gz -C ~/.mujoco/
echo -e "\n# MuJoCo\nMUJOCO_PATH=~/.mujoco/mujoco210/bin
source ~/.bashrc
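A quick sanity check can catch a failed extraction before any experiment is launched. The following is a minimal sketch: `check_mujoco` is a hypothetical helper, not part of Nevo, and it only tests that the directory used in the steps above exists.

```shell
# Hypothetical helper (not part of Nevo): report whether a MuJoCo bin
# directory exists. Defaults to the path used in the steps above.
check_mujoco() {
    dir="${1:-$HOME/.mujoco/mujoco210/bin}"
    if [ -d "$dir" ]; then
        echo "found: $dir"
    else
        echo "missing: $dir" >&2
        return 1
    fi
}
```

Calling `check_mujoco` with no argument tests the default location; pass another directory as the first argument if MuJoCo was extracted elsewhere.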
# Library & Dependencies
git clone && cd nevo/
virtualenv venv && source venv/bin/activate && pip3 install -r requirements.txt
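The later commands assume the virtualenv is still active in the current shell. As a minimal sketch, the hypothetical helper `venv_active` below (not part of Nevo) relies only on the fact that `source venv/bin/activate` exports `VIRTUAL_ENV`.

```shell
# Hypothetical helper (not part of Nevo): succeed only if a virtualenv is
# active in this shell. Activation exports VIRTUAL_ENV, so an empty or
# unset value means the environment was not activated.
venv_active() {
    [ -n "${VIRTUAL_ENV:-}" ]
}
```

For example, `venv_active || source venv/bin/activate` re-activates the environment when a new terminal is opened in the `nevo/` directory.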
<task> = acrobot, cart_pole, mountain_car, mountain_car_continuous, pendulum, bipedal_walker, bipedal_walker_hardcore, lunar_lander, lunar_lander_continuous, ant, half_cheetah, hopper, humanoid, humanoid_standup, inverted_double_pendulum, inverted_pendulum, reacher, swimmer or walker_2d
<net> = static/rnn or dynamic/rnn
mpiexec -n <n> python3 --env_path envs/multistep/score/ \
--bots_path bots/network/<net>/ \
--nb_generations <nb_generations> \
--population_size <population_size> \
--additional_arguments '{"task" : "<task>"}'
Example from the paper (you can increase the number of MPI processes if your machine allows it):
mpiexec -n 2 python3 --env_path envs/multistep/score/ \
--bots_path bots/network/dynamic/rnn/ \
--nb_generations 100 \
--population_size 16 \
--additional_arguments '{"task" : "acrobot"}'
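A mistyped task name is easy to miss inside the JSON string passed to `--additional_arguments`. The sketch below builds that string from the task list given earlier and rejects names that are not on it. `TASKS` and `task_args` are hypothetical names introduced for illustration; they are not part of Nevo.

```shell
# Task names copied from the list of valid <task> values above.
TASKS="acrobot cart_pole mountain_car mountain_car_continuous pendulum \
bipedal_walker bipedal_walker_hardcore lunar_lander lunar_lander_continuous \
ant half_cheetah hopper humanoid humanoid_standup inverted_double_pendulum \
inverted_pendulum reacher swimmer walker_2d"

# Hypothetical helper (not part of Nevo): print the JSON value for
# --additional_arguments, or fail if the task name is not in TASKS.
task_args() {
    for t in $TASKS; do
        if [ "$t" = "$1" ]; then
            printf '{"task" : "%s"}\n' "$1"
            return 0
        fi
    done
    echo "unknown task: $1" >&2
    return 1
}
```

The result can then be passed as `--additional_arguments "$(task_args acrobot)"` in place of the hand-written JSON.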
wget -P ~/Downloads/
unzip -o ~/Downloads/ -d data/states/
You can now run additional generations ...
mpiexec -n 4 python3 --env_path envs/multistep/score/ \
--bots_path bots/network/dynamic/rnn/ \
--nb_elapsed_generations 100 \
--nb_generations 10 \
--population_size 16 \
--additional_arguments '{"task" : "cart_pole"}'
... Evaluate the new agents ...
mpiexec -n 4 python3 utils/ --states_path data/states/envs.multistep.score.control/
... And both record the elite's behaviour and obtain its architecture.
python3 utils/ --state_path data/states/envs.multistep.score.control/
# Verify Stable Baselines 3 Results (not necessary for later steps)
git clone --recursive ~/rl-baselines3-zoo
jupyter notebook utils/notebooks/neuroevolution-recurrent-architectures/sb3_baselines.ipynb
# Visualize the dynamic networks
jupyter notebook utils/notebooks/neuroevolution-recurrent-architectures/dynamic_rnn.ipynb
# Reproduce the paper's figures
jupyter notebook utils/notebooks/neuroevolution-recurrent-architectures/figures.ipynb