AllenNeuralDynamics / aind-dynamic-foraging-models

behavioral models for the dynamic foraging task
MIT License

feat: migrate Han's RL-MLE fittings (still work in progress!) #18

Closed · hanhou closed this 2 months ago

hanhou commented 3 months ago

Steps to refactor:

hanhou commented 2 months ago

I decided to refactor the code so that agent, environment, and task are fully disentangled.

[architecture diagram: agent, environment, and task fully disentangled]

Here are some ideas:

  1. We code all AIND behavioral tasks (including future VR foraging / force foraging) using gymnasium in a shared library, aind-behavior-gym. For dynamic foraging, we can further separate the block-structure logic into its own library, aind-dynamic-foraging-reward-schedule (a minimal env sketch follows this list).

  2. Both aind-behavior-gym and the behavioral GUI that trains the animals call the same task class in reward-schedule, so that our artificial agents and our animals perform exactly the same task.

  3. On the other hand, we can have different agent libraries that perform the task in the gym: Po-Chen's meta-RL agents, my Q-learning models, Lukasz's Bayesian agents, or Ulises's RNNs (see the agent-loop sketch after this list).

  4. If we want to fit the agents to animal behavior, we can write the fitting functions inside the agent libraries, since fitting should not depend on the exact task (see the MLE-fitting sketch below).
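
To make idea 1 concrete, here is a minimal sketch of what a dynamic-foraging environment could look like under the standard gymnasium API. The class name, the block-switching rule, and all defaults are hypothetical placeholders, not the actual aind-behavior-gym design:

```python
import gymnasium as gym
from gymnasium import spaces


class DynamicForagingEnv(gym.Env):
    """Two-armed dynamic foraging task with block-wise reward probabilities.

    Hypothetical sketch: the name, block logic, and defaults are
    illustrative, not the real aind-behavior-gym API.
    """

    def __init__(self, block_len=40, p_reward_pairs=((0.8, 0.2), (0.2, 0.8)),
                 n_trials=1000):
        super().__init__()
        self.action_space = spaces.Discrete(2)       # 0 = left port, 1 = right port
        self.observation_space = spaces.Discrete(1)  # no stimulus; dummy observation
        self.block_len = block_len
        self.p_reward_pairs = p_reward_pairs
        self.n_trials = n_trials

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)                     # seeds self.np_random
        self.trial = 0
        self.block_idx = 0
        self.p_reward = self.p_reward_pairs[0]
        return 0, {}

    def step(self, action):
        # Bernoulli reward drawn from the current block's probabilities
        reward = float(self.np_random.random() < self.p_reward[action])
        self.trial += 1
        if self.trial % self.block_len == 0:         # block transition
            self.block_idx = (self.block_idx + 1) % len(self.p_reward_pairs)
            self.p_reward = self.p_reward_pairs[self.block_idx]
        terminated = self.trial >= self.n_trials     # fixed session length
        return 0, reward, terminated, False, {}
```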
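For idea 3, any agent that emits an action and consumes a reward can plug into such an environment. Below is a sketch with a toy delta-rule Q-learner running the standard gymnasium loop against the hypothetical env above; the names and parameters are illustrative, not the actual agent libraries:

```python
import numpy as np


class QLearningAgent:
    """Minimal delta-rule Q-learner with a softmax policy (illustrative only)."""

    def __init__(self, alpha=0.3, beta=5.0, seed=None):
        self.alpha, self.beta = alpha, beta
        self.q = np.zeros(2)
        self.rng = np.random.default_rng(seed)

    def act(self):
        # Softmax over two options reduces to a logistic of the value difference
        p_right = 1.0 / (1.0 + np.exp(-self.beta * (self.q[1] - self.q[0])))
        return int(self.rng.random() < p_right)

    def learn(self, action, reward):
        # Rescorla-Wagner / delta-rule update of the chosen option only
        self.q[action] += self.alpha * (reward - self.q[action])


# Standard gymnasium loop: the agent sees only (action, reward) and never
# touches the reward-schedule internals, so any agent library can plug in.
env = DynamicForagingEnv()
agent = QLearningAgent(seed=1)
obs, info = env.reset(seed=0)
terminated = truncated = False
choices, rewards = [], []
while not (terminated or truncated):
    action = agent.act()
    obs, reward, terminated, truncated, info = env.step(action)
    agent.learn(action, reward)
    choices.append(action)
    rewards.append(reward)
```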
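And for idea 4, fitting lives entirely on the agent side: given arrays of choices and rewards, a maximum-likelihood fit never needs to touch the task. A minimal sketch using scipy.optimize.minimize on the negative log-likelihood of the same toy Q-learner; the parameterization is an assumption, not the actual fitting code being migrated in this PR:

```python
import numpy as np
from scipy.optimize import minimize


def negative_log_likelihood(params, choices, rewards):
    """NLL of observed choices under the 2-parameter Q-learner sketched above."""
    alpha, beta = params
    q = np.zeros(2)
    nll = 0.0
    for c, r in zip(choices, rewards):
        p_right = 1.0 / (1.0 + np.exp(-beta * (q[1] - q[0])))
        p_choice = p_right if c == 1 else 1.0 - p_right
        nll -= np.log(max(p_choice, 1e-10))   # guard against log(0)
        q[c] += alpha * (r - q[c])            # update after the observed choice
    return nll


# `choices` and `rewards` can come from an animal session or from the
# simulated loop above; the fit needs only these arrays, not the task.
result = minimize(
    negative_log_likelihood,
    x0=[0.5, 3.0],                            # initial guess for (alpha, beta)
    args=(np.asarray(choices), np.asarray(rewards)),
    bounds=[(1e-3, 1.0), (1e-2, 20.0)],
    method="L-BFGS-B",
)
alpha_hat, beta_hat = result.x
```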

@rachelstephlee @ZhixiaoSu @alexpiet

hanhou commented 2 months ago

To keep this PR from growing too big, I'll go ahead and merge it into develop and then work on the remaining steps.