Flock-learning

This repository contains the model I wrote for my MSc research project in theoretical physics. The model aims to describe collective motion using Q-learning with orientation-based rewards.

In this documentation I explain the technical details of the model. The conceptual framework is laid out in my thesis, which can be found here.

Getting started

First, set up your (virtual) environment:

$ pip install -r requirements.txt

Simulations can then be performed in a couple of different ways:

  1. Start a regular simulation directly by creating an instance of the Field class, passing an integer that specifies the number of birds.

    >>> from field import Field
    >>> Field(100)

    Two other ways to run a Field instance are by recording a movie:

    >>> Field(100, record_mov = True, sim_length = 5000)

    or by recording only the data:

    >>> Field(100, record_data = True, plot = False, sim_length = 5000)

    The quantities that are recorded can be further specified using the record_quantities keyword.
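
    For instance (a sketch; the quantity names below are purely illustrative, not necessarily the keys the model accepts):

    >>> Field(
    ...     100, record_data = True, plot = False, sim_length = 5000,
    ...     record_quantities = ['v', 'Delta']  # illustrative names only
    ... )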

  2. Start a simulation with fixed Q-tables from a given file (the output of a Q-learning training phase with record_data = True)

    >>> from main import load_from_Q
    >>> load_from_Q(fpath = 'data/20200531/2-VI/20200531-100329-Q.npy')
  3. Start a simulation with fixed Q-values from a specific value of Delta

    >>> from main import load_from_Delta
    >>> load_from_Delta(0.2)
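
    To compare several values of Delta, the call can simply be placed in a loop (a sketch; the values below are illustrative):

    >>> for Delta in [0.1, 0.2, 0.5]:
    ...     load_from_Delta(Delta)  # one fixed-Q simulation per value of Delta
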
  4. Start multiple simulations at the same time using multiprocessing, with the option of adjusting parameters in different runs:

    >>> from main import run_parallel  # assuming run_parallel lives in main, like the other helpers
    >>> pars = [
    ...     {'observation_radius': value}
    ...     for value in [10, 50, 100, 150]
    ... ]
    >>> run_parallel(pars, sim_length = 10_000, comment = 'vary_obs_rad')

    NB: There is a complication regarding multiprocessing and the Python random module, which sometimes results in very similar initializations. This problem has not been solved yet. Cf. this blog post.

  5. If the Q-tables of a given simulation have been saved at regular intervals (using the option Q_every), a separate (short) simulation can be performed for every Q-table saved in the resulting data directory. The graphs on the right-hand side of this and this figure were generated using this option.

    >>> from main import run_Q_dirs  # assuming run_Q_dirs lives in main, like the other helpers
    >>> data_dir = 'path/to/some/dir'
    >>> run_Q_dirs(data_dir)
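
    A minimal sketch of the full workflow, assuming Q_every sets the saving interval and that the saved Q-tables end up in a data directory that can be passed to run_Q_dirs (path and interval below are illustrative):

    >>> Field(100, record_data = True, Q_every = 1000, sim_length = 10_000)  # assumption: a Q-table is saved every 1000 steps
    >>> run_Q_dirs('data/20200531/2-VI')  # illustrative path to the resulting data directory
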
  6. Run a benchmark test and create a figure with the results

    >>> from main import benchmark
    >>> benchmark()
    >>> p.plot_all(
    ...     quantity = 't', save_as = 'benchmark.png',
    ...     title = 'Benchmark test', legend = 8
    ... )

All options for the Field and Birds classes

field.Field(numbirds, sim_length = 12500, record_mov = False, record_data = False, record_time = False, record_quantities = [], field_dims = FIELD_DIMS, periodic = True, plotscale = PLOTSCALE, plot = True, comment = '', Q_every = 0, repos_every = 0, record_every = 500, **kwargs)

All other options will be passed to the birds.Birds instance.
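
Because unrecognized keyword arguments are forwarded in this way, Birds parameters such as observation_radius or leader_frac can be set directly when constructing a Field (a sketch; the values below are illustrative):

>>> Field(
...     100, record_data = True, sim_length = 5000,
...     observation_radius = 50, leader_frac = 0.25  # forwarded to birds.Birds
... )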

birds.Birds(numbirds, field_dims, action_space = A, observation_space = O, leader_frac = 0.25, reward_signal = R, learning_alg = 'Q', alpha = alpha, gamma = gamma, epsilon = epsilon, Q_file = '', Q_tables = None, gradient_reward = True, observation_radius = d, instincts = [], eps_decr = 0)

numbirds and field_dims will be inherited from the Field class. All other options are: