A next-level PVLV / boa step is to start adapting / learning key params such as:
Expected effort -- this can depend on the context, and controls when to give up. The NE (norepinephrine) neuromodulator is widely thought to be involved in regulating this (implicated in ADHD, etc.). These are currently set to fixed values in the PVLV.Effort.Max* params.
Expected reward magnitude -- scaling of DA responses as a function of overall expected rewards is well documented (e.g., 2 drops of juice in the context of a 1-2 range gives max DA, but in the context of a 2-4 range gives a reduced response). This is a separable factor from the VSPatch prediction of timing and magnitude for an individual reward: it depends on the overall context (Niv et al. have studied this). These are currently set in the PVLV.USs.Gain* and VTA params.
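As a rough illustration of the range-dependent DA scaling described above, here is a minimal sketch of a per-context reward range tracker. Everything in it is hypothetical: `RewardRange`, `Update`, `Scaled`, and the asymmetric update rule (jump up immediately, decay down slowly) are assumptions for illustration, not existing axon / PVLV code or a claim about how the adaptation should finally work.

```go
package main

import "fmt"

// RewardRange is a hypothetical per-context tracker of expected reward
// magnitude; it is not an existing axon / PVLV type.
type RewardRange struct {
	Max  float32 // running estimate of the top of the reward range
	Rate float32 // decay rate toward smaller rewards, in (0, 1]
}

// Update adapts Max to an observed reward magnitude: it jumps up to a new
// maximum immediately and decays toward smaller magnitudes at Rate.
// This asymmetric rule is an assumption made for the sketch.
func (rr *RewardRange) Update(mag float32) {
	if mag > rr.Max {
		rr.Max = mag
		return
	}
	rr.Max += rr.Rate * (mag - rr.Max)
}

// Scaled returns mag relative to the expected range, clipped to [0, 1].
func (rr *RewardRange) Scaled(mag float32) float32 {
	if rr.Max <= 0 || mag <= 0 {
		return 0
	}
	if s := mag / rr.Max; s < 1 {
		return s
	}
	return 1
}

func main() {
	lo := &RewardRange{Rate: 0.1} // context with 1-2 drops of juice
	hi := &RewardRange{Rate: 0.1} // context with 2-4 drops of juice
	for _, r := range []float32{1, 2, 1, 2} {
		lo.Update(r)
	}
	for _, r := range []float32{2, 4, 2, 4} {
		hi.Update(r)
	}
	// The same 2-drop reward is scaled differently in the two contexts.
	fmt.Println(lo.Scaled(2), hi.Scaled(2))
}
```

With these inputs, 2 drops gives the full scaled response in the 1-2 context and a reduced one in the 2-4 context. The same kind of running estimate could in principle adapt an expected-effort maximum, again only as an assumption.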
Mechanically, this just requires putting the params somewhere appropriate -- Globals if NData-specific, or a specific layer's params, as in the case of VTALayer -- and then, critically, saving and loading these adapting values with the weights file.
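The save / load part could look something like the following sketch. It assumes the adapting values are gathered into one small struct and serialized as JSON next to the weights; `AdaptParams`, its fields, and `RoundTrip` are hypothetical names, and the actual weights file format and where these values would live in it are not shown here.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
)

// AdaptParams is a hypothetical container for values that adapt during
// training and therefore must persist alongside the weights.
type AdaptParams struct {
	EffortMax float32 `json:"effort_max"` // adapted expected-effort maximum (assumed)
	USGain    float32 `json:"us_gain"`    // adapted reward magnitude gain (assumed)
}

// Save writes the adapting values as a single JSON object.
func (ap *AdaptParams) Save(w io.Writer) error {
	return json.NewEncoder(w).Encode(ap)
}

// Load reads the adapting values back, overwriting the current ones.
func (ap *AdaptParams) Load(r io.Reader) error {
	return json.NewDecoder(r).Decode(ap)
}

// RoundTrip saves ap to an in-memory buffer and loads it into a fresh
// struct, standing in for writing and re-reading a weights file.
func RoundTrip(ap AdaptParams) (AdaptParams, error) {
	var buf bytes.Buffer
	var out AdaptParams
	if err := ap.Save(&buf); err != nil {
		return out, err
	}
	err := out.Load(&buf)
	return out, err
}

func main() {
	loaded, err := RoundTrip(AdaptParams{EffortMax: 0.75, USGain: 1.5})
	if err != nil {
		fmt.Println("round trip failed:", err)
		return
	}
	fmt.Printf("%+v\n", loaded)
}
```

Taking `io.Writer` / `io.Reader` rather than a file name means the same methods could be called from whatever code already writes and reads the weights, so the adapting values travel in the same file.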