Meng-Ling-Ori / Value_free

ToDo: Leisure reinforcer in simulation 4 #2

Open SaschaFroelich opened 3 years ago

SaschaFroelich commented 3 years ago

I think that also in the free operant cases of simulation 4, the agent is supposed to receive a leisure reinforcer when they don't press. Miller states on page 5:

"This value (U(m)) is typically unity for reinforcers designated 'food pellets', 0.1 for reinforcers designated 'leisure', and -1 for reinforcers designated 'effort', unless otherwise noted".

Then later on p. 10, in section Omission and Devaluation in VR vs. VI schedules they say "The magnitude of the Leisure reinforcer (reflecting effort cost) was not changed".

This is a bit ambiguous (why does leisure reinforcer reflect effort cost?), but we should try giving the agent a leisure reinforcer with utility 0.1 for each second they don't press the lever.

Meng-Ling-Ori commented 3 years ago

I noticed this contradiction too and tried setting nm = 2 (only pellets and effort), but the results were not good. I will try adding the utility for 'leisure'. I had already thought about this a lot, though, and found it difficult: the number of presses in a second is Poisson-distributed (0, 1, 2, 3 or more), so it is hard to say how many 'not press' events occur in a second. I will try letting only seconds with 0 presses yield a 'leisure' reinforcer and see what happens.
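A minimal sketch of that rule, not the code in this repository: the number of presses in each second is drawn from a Poisson distribution, and a second yields the leisure reinforcer only when it contains 0 presses. The function name `simulate_leisure` and the parameters `press_rate`, `n_seconds` and `seed` are hypothetical; the utility 0.1 is the value quoted from Miller above.

```python
import numpy as np

# Leisure utility quoted from Miller (p. 5) earlier in this thread.
U_LEISURE = 0.1


def simulate_leisure(press_rate, n_seconds, seed=0):
    """Hypothetical helper: assign a leisure reinforcer per 1-second bin.

    Assumption: presses per second follow Poisson(press_rate), and a bin
    counts as 'not press' only if it contains exactly 0 presses.
    """
    rng = np.random.default_rng(seed)
    # Number of lever presses in each 1-second bin.
    presses = rng.poisson(lam=press_rate, size=n_seconds)
    # Leisure utility is delivered only in bins without any press.
    leisure = np.where(presses == 0, U_LEISURE, 0.0)
    return presses, leisure


presses, leisure = simulate_leisure(press_rate=0.5, n_seconds=1000)
print("fraction of leisure seconds:", np.mean(presses == 0))
print("total leisure utility:", leisure.sum())
```

Under this assumption the probability of a leisure second is exp(-press_rate), so the leisure reinforcer becomes rare once the press rate is well above 1 per second.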

SaschaFroelich commented 3 years ago

Yes I think that would be the most reasonable way to go about it.

Meng-Ling-Ori commented 3 years ago

I tried it, but nothing changed. The new code has been uploaded. I will keep trying! Or at least I will try to simulate the last framework: the two-armed bandit.