Closed gcroci2 closed 1 month ago
A major goal in neuroscience is to understand the relationship between an animal’s behavior and how this is encoded in the brain. Typical experiment: training an animal to perform a task and recording the activity of its neurons while the animal carries out the task.
To complement these experimental results, researchers “train” artificial neural networks to simulate the same tasks on a computer. Unlike real brains, artificial neural networks provide complete access to the “neural circuits” responsible for a behavior, offering a way to study and manipulate the behavior in the circuit.
You can use:
Song et al.'s networks consisted of two parts:
Other info:
The environment $\epsilon$ represents the experimentalist, while the agent $A$ represents the animal. At each time $t$ the agent chooses actions after observing the inputs provided by the environment; the probability of choosing each action is given by the agent's policy $\pi_{\theta}$ with parameters $\theta$. Here the policy is implemented as the output of an RNN, so $\theta$ comprises the connection weights, biases, and initial state of the decision network.
In this work they only consider cases where the agent chooses one out of $N_a$ possible actions at each time, so that $\pi_{\theta}(a_t \mid u_{1:t})$ for each $t$ is a discrete, normalized probability distribution over the possible actions $a_1, \dots, a_{N_a}$.
After each set of actions by the agent at time $t$, the environment provides a reward (or special observable) $\varrho_{t+1}$ at time $t+1$, which the agent attempts to maximize.
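As a concrete illustration of the notation above, a discrete policy $\pi_{\theta}(a_t \mid u_{1:t})$ produced by an RNN can be sketched in NumPy. Everything here is a toy assumption — the dimensions, weight names, and random inputs are not the paper's actual network, just the minimal structure the paragraph describes:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: 3 observable inputs, 8 hidden units, N_a = 2 actions.
n_in, n_hid, n_actions = 3, 8, 2

# theta: connection weights, biases, and initial state of the decision network.
W_in = rng.normal(0, 0.1, (n_hid, n_in))
W_rec = rng.normal(0, 0.1, (n_hid, n_hid))
b = np.zeros(n_hid)
W_out = rng.normal(0, 0.1, (n_actions, n_hid))
h0 = np.zeros(n_hid)

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def policy(inputs):
    """Return pi_theta(a_t | u_{1:t}) at each time step: a normalized
    distribution over the N_a possible actions, conditioned on the input
    history through the recurrent state h."""
    h = h0
    probs = []
    for u_t in inputs:
        h = np.tanh(W_in @ u_t + W_rec @ h + b)
        probs.append(softmax(W_out @ h))
    return np.array(probs)

u = rng.normal(size=(5, n_in))  # 5 time steps of observations from the environment
pi = policy(u)                  # shape (5, n_actions), each row sums to 1
```

Each row of `pi` is a discrete, normalized distribution from which the agent's action $a_t$ would be sampled.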
As a reference for the code, see PR #16
**Data in each epoch**

`trials = task.generate_trials()` is always the same across one network's training (`rng` is fixed once for each network), so it constitutes the entire training set. It is reused thousands of times (epochs) to train the net. Are we fine with that? If yes, there is no point in regenerating the same trials for each epoch every time. What do we think makes more sense to do here?
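To make the question concrete, here is a minimal sketch of the two options — fixed-seed generation (identical trials every epoch) versus keeping one generator alive (fresh trials every epoch). The `generate_trials` function below is a hypothetical stand-in, not the repo's actual Task API:

```python
import numpy as np

# Hypothetical stand-in for task.generate_trials(); the real Task API isn't
# shown in this issue, so this only illustrates the two sampling strategies.
def generate_trials(rng, n_trials=5):
    return rng.integers(0, 100, size=n_trials)

# Current behavior: rng seeded once per network, so every "epoch" re-creates
# the identical set of trials -- a finite, repeated training set.
fixed_a = generate_trials(np.random.default_rng(42))
fixed_b = generate_trials(np.random.default_rng(42))
assert np.array_equal(fixed_a, fixed_b)  # same trials every epoch

# Alternative: keep one rng alive across epochs, so each epoch samples fresh
# trials and the task acts as an effectively unlimited training set.
rng = np.random.default_rng(42)
fresh_a = generate_trials(rng)
fresh_b = generate_trials(rng)  # rng state has advanced -> different trials
```

If the fixed-seed behavior is intentional, generating once and caching would indeed avoid redundant work; if not, the second pattern gives new trials per epoch for free.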
**Mini Batch vs Batch Gradient Descent**

Each epoch currently trains on a single mini-batch (`minibatch_size`). Usually, we need many more batches to train during one complete epoch. Shouldn't we increase the number of samples used for one epoch? The loss is `nn.MSELoss`, so that's just one step of gradient descent in one epoch. Is this a correct way of training from a ML point of view?

**Validation**
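For reference, a standard mini-batch loop performs many gradient updates per epoch, one per mini-batch. The sketch below uses a toy linear-regression problem with made-up sizes (not the repo's model) and hand-computes the MSE gradient that `nn.MSELoss` plus backprop would provide:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy, noise-free regression set standing in for the generated trials
# (hypothetical sizes: 512 samples, 3 features).
X = rng.normal(size=(512, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w

w = np.zeros(3)
minibatch_size, lr = 64, 0.1

for epoch in range(50):
    perm = rng.permutation(len(X))  # reshuffle the training set each epoch
    for start in range(0, len(X), minibatch_size):
        idx = perm[start:start + minibatch_size]
        err = X[idx] @ w - y[idx]
        # Gradient of mean squared error over the mini-batch.
        grad = 2 * X[idx].T @ err / len(idx)
        w -= lr * grad              # one parameter update per mini-batch

# 512 / 64 = 8 updates per epoch, not 1; w converges to true_w.
```

The point of the sketch is only the loop structure: with the current setup (one mini-batch per epoch), the inner loop collapses to a single update, which is what the question above is flagging.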
Any particular reason why validation starts only after epoch 200? I think we should properly implement an early-stopping callback.
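A minimal early-stopping sketch, assuming the usual patience-based criterion (all names below are illustrative, not the repo's API): stop when the validation loss has not improved for `patience` consecutive epochs.

```python
class EarlyStopping:
    """Stop training when validation loss stops improving.

    Hypothetical helper, not an existing class in this repo: tracks the best
    validation loss seen so far and counts epochs without improvement.
    """

    def __init__(self, patience=10, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Call once per epoch; returns True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```

In the training loop this would replace the hard-coded "start validating at epoch 200" rule: validate every epoch (or every k epochs) and break when `step()` returns True.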
**Useful notes**

- Batch size: the number of samples to work through before updating the internal model parameters.
- Number of epochs: the number of times the learning algorithm works through the entire training dataset.
- Batch, Mini Batch & Stochastic Gradient Descent
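These two definitions combine into a simple relation — the number of parameter updates per epoch — which also distinguishes the three gradient-descent variants (the sample counts below are illustrative):

```python
import math

n_samples, batch_size = 1000, 64  # illustrative numbers

# One epoch = one pass over all samples = ceil(n_samples / batch_size) updates.
updates_per_epoch = math.ceil(n_samples / batch_size)  # -> 16

# batch GD:      batch_size == n_samples  -> 1 update per epoch
# mini-batch GD: 1 < batch_size < n_samples
# stochastic GD: batch_size == 1          -> n_samples updates per epoch
```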
This issue blocks #15
We decided to start with the reinforcement learning part first, in particular by trying to mimic the procedure of this article: https://elifesciences.org/articles/21492