Open SaschaFroelich opened 3 years ago
Actually, the role of 'i' here is only to create a list: range(i). The number of loop iterations then depends on the number of elements in that list (e.g. when action = [1,0,0,0], range(i) is empty, which results in no choice; when action = [0,1,0,0], range(1) is [0], which results in a single iteration). I know that what I set up here is weird, but I don't think it's wrong.
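A minimal sketch of the point above, assuming action is a one-hot array of size 4 as described in the issue. The helper name n_draws is hypothetical and not part of the repository; it only counts how many iterations the list comprehension would perform for each action.

```python
import numpy as np

def n_draws(action):
    """Count the loop iterations (one random draw each) for a one-hot action.

    Hypothetical helper for illustration; it mirrors the expression
    range(np.where(action == 1)[0][0]) from the issue.
    """
    action = np.asarray(action)
    index_of_one = np.where(action == 1)[0][0]  # position of the 1 in the array
    return len(range(index_of_one))

# The position of the 1 equals the number of lever presses,
# so the loop runs exactly that many times.
for action in ([1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]):
    print(action, "->", n_draws(action), "draw(s)")
```

Under this reading, one lever press gives one draw, two presses give two draws, and so on.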
Ah yes, you're right.
1) In class environment_free_operant(object):, in the method def obtained_reinforcement(self,t,action,VR,VI,omission):, the reinforcers for actions are paid out like this:

number_of_reinforcer = sum([np.random.choice(range(self.nm-1), p = [1-VR/100, VR/100]) for i in range(np.where(action ==1)[0][0])])

If I interpret the code correctly, then action (which is an array of size 4) indicates how often the lever was pressed in a second: 0, 1, 2 or 3 times. But if the lever was pressed once, then action = [0, 1, 0, 0], and range(np.where(action ==1)[0][0]) is equivalent to range(0,1), which only contains the value 0. Similarly, in the case of 2 or 3 lever presses, the number of presses considered is 1 too few.

2) Why is the first argument in np.random.choice() range(self.nm-1)? Shouldn't it be range(2) since we are always sampling from [0,1]? It would also throw an error if self.nm != 3, since p contains only two elements.