Closed DaniBodor closed 2 months ago
- 8d11a5a: I believe that `len(output_behavior)` needs to match `n_outputs`, so I made the code read that length rather than having to define it by the user.
The reason for this was that, in principle, the output could have had more values, and this was not necessarily related to the number of outputs. But given how we are dealing with it now (i.e., using only one output signal and assigning it a number of values equal to the number of possible actions), I'd say your change is correct. Let's keep it as you did for now.
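For reference, the idea of reading the length instead of asking the user for it can be sketched roughly as follows. This is only an illustration, not the actual `AnnubesEnv` code: the `OutputConfig` class and its default values are hypothetical, and only the names `output_behavior` and `n_outputs` come from the discussion above.

```python
from dataclasses import dataclass, field


@dataclass
class OutputConfig:
    """Hypothetical stand-in for the output-related part of the env setup."""

    # One value per possible action (default chosen only for illustration).
    output_behavior: list[float] = field(default_factory=lambda: [0.0, 1.0])

    @property
    def n_outputs(self) -> int:
        # Derived from the list, so the two can never get out of sync.
        return len(self.output_behavior)


config = OutputConfig(output_behavior=[0.0, 0.5, 1.0])
print(config.n_outputs)  # 3
```

Because `n_outputs` is computed on access, there is no separate argument that a user could set to a conflicting value.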
- d358204: In other envs (including the template one), the "choice" options of the `action_space` do not include the 0 option. Is there a reason this was different for `AnnubesEnv`?
Good point. I remember I was also confused by that, and I'm not sure why I left it this way in the end. Yes, your changes are good here, and when plotting the trials everything seems to work.
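As a rough illustration of the convention being discussed, the "choice" options can be built so that they start at 1 and never include 0. This is a sketch under assumptions: `action_names` is a hypothetical helper, not part of the env, and treating index 0 as the fixation action is an assumption on my side rather than something stated in this thread.

```python
def action_names(n_actions: int) -> dict:
    """Hypothetical helper: map action indices to names.

    Assumes index 0 is reserved (labelled "fixation" here) and that
    every remaining index is a "choice" option.
    """
    if n_actions < 2:
        raise ValueError("need at least one choice besides action 0")
    return {"fixation": 0, "choice": list(range(1, n_actions))}


names = action_names(3)
print(names)  # {'fixation': 0, 'choice': [1, 2]}
```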
I made 2 changes to `AnnubesEnv`, both of which I am not 100% certain are correct. Please let me know if I am mistaken and I can drop the change.