AllenNeuralDynamics / aind-dynamic-foraging-models

behavioral models for the dynamic foraging task
MIT License

RL model fitting pipeline #33

Open hanhou opened 1 week ago

hanhou commented 1 week ago

Now that the RL MLE model-fitting library is ready, I'd like to use it as the first MVP of our new analysis pipeline (doc here). I'm going to try what Jake proposed (see below).

Related issues

Steps:


I changed my mind. Expecting that we might hit many roadblocks working with CO, I chose to do some fast prototyping of our initial idea: building our own hashing and job-dispatching mechanism.
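A minimal sketch of what such a hashing and dispatching step could look like. The field names (`nwb_name`, `model_name`, `analysis_version`) and both helper functions are hypothetical, chosen only for illustration; the actual job spec is not defined in this thread.

```python
import hashlib
import json


def job_hash(nwb_name: str, model_name: str, analysis_version: str) -> str:
    """Return a deterministic hash identifying one (session, model, version) job.

    All three fields are assumed for illustration, not taken from the pipeline.
    """
    spec = {
        "nwb_name": nwb_name,
        "model_name": model_name,
        "analysis_version": analysis_version,
    }
    # sort_keys plus fixed separators give a canonical JSON string, so the
    # same spec always hashes to the same value.
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def pending_jobs(candidates, done_hashes):
    """Return the candidate specs whose hash is not already in done_hashes."""
    return [spec for spec in candidates if job_hash(*spec) not in done_hashes]
```

With this shape, the dispatcher only needs a set of hashes for completed jobs; bumping `analysis_version` changes every hash and so forces a re-run.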

Version control

Inputs:

Outputs:

Policy:

hanhou commented 1 week ago

From David:

I discussed our options with Jake from CO today. He proposed a variation on our architecture:

  1. A script/capsule that queries for sessions of interest, creates a combined data asset, attaches it to the pipeline, then triggers the analysis pipeline and waits.
  2. Analysis pipeline runs. Nextflow's cache will not re-run any jobs that have already been run on data within that combined data asset, which is the caching behavior we want.
  3. The capsule in (1) captures a data asset per subfolder in the results, creates a combined data asset out of them, then deletes the run.
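The three steps above can be sketched as a single orchestration function. `PipelineClient` and all of its methods are a hypothetical interface invented for this sketch; it is not the Code Ocean API, and a real implementation would have to map each call onto whatever CO actually exposes.

```python
from __future__ import annotations

from typing import Protocol


class PipelineClient(Protocol):
    """Hypothetical interface for illustration; NOT the Code Ocean API."""

    def query_sessions(self) -> list[str]: ...
    def create_combined_asset(self, asset_ids: list[str]) -> str: ...
    def run_pipeline(self, combined_asset_id: str) -> str: ...
    def wait(self, run_id: str) -> None: ...
    def result_subfolders(self, run_id: str) -> list[str]: ...
    def capture_asset(self, run_id: str, subfolder: str) -> str: ...
    def delete_run(self, run_id: str) -> None: ...


def run_once(client: PipelineClient) -> str:
    """Run steps (1)-(3) once and return the ID of the combined output asset."""
    # Step 1: query sessions, combine them, trigger the pipeline, and wait.
    sessions = client.query_sessions()
    inputs = client.create_combined_asset(sessions)
    run_id = client.run_pipeline(inputs)
    # Step 2 happens inside the pipeline; here we only block until it is done.
    client.wait(run_id)
    # Step 3: capture one asset per results subfolder, combine, delete the run.
    captured = [
        client.capture_asset(run_id, subfolder)
        for subfolder in client.result_subfolders(run_id)
    ]
    outputs = client.create_combined_asset(captured)
    client.delete_run(run_id)
    return outputs
```

Keeping the client behind an interface means the control flow can be exercised with a fake before any CO details are settled.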

The only hard blocker on this now is that the Nextflow cache files currently expire after 30 days. Jake is looking into whether we can configure a custom location for the cache for this pipeline.

Another issue is that combined data assets only work with external assets right now. I was already planning to start capturing processed outputs to s3://aind-open-data anyway, so soon this will not be an issue. And combined data assets will support internal assets in ~6 months anyway.

> Another issue is that combined data only work with external asset right now.

This should not be an issue because we'll get the NWB files from S3 anyway (s3://aind-behavior-data/foraging_nwb_bonsai/).