Autonomous action selection in the three demos that use it (q-learning-1d / -2d / second-level-motor) is now done with a custom htm-step function. It runs the first stage, activation, then selects an action before completing the depolarisation stage with an updated input value; only the :action component of the input value is changed. The function also puts the input value for the next timestep onto the world channel.
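A minimal sketch of the shape of such a step function, assuming comportex core functions htm-sense, htm-activate and htm-depolarise with the signatures shown; the names htm-step-with-action-selection, select-action and world-c are illustrative stand-ins, not the actual demo code, and learning is elided.

```clojure
(ns demo.custom-step
  (:require [org.nfrac.comportex.core :as cx]
            [clojure.core.async :as async]))

(defn htm-step-with-action-selection
  "Returns a step function that selects an action between the
  activation and depolarisation stages. (Illustrative sketch only.)"
  [world-c select-action]
  (fn [htm inval]
    (let [;; first stage: sense the new input and activate columns/cells
          htm-a (-> htm
                    (cx/htm-sense inval :sensory)
                    (cx/htm-activate))
          ;; choose an action from the newly active state
          action (select-action htm-a inval)
          ;; update only the :action component of the input value
          inval' (assoc inval :action action)]
      ;; pass the input value for the next timestep back to the world
      (async/put! world-c inval')
      ;; second stage: sense the motor input and complete depolarisation
      (-> htm-a
          (cx/htm-sense inval' :motor)
          (cx/htm-depolarise)))))
```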
Accordingly, htm-sense gains a mode argument restricting which senses are updated: either :sensory or :motor.
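A hedged illustration of the mode argument, assuming a three-argument arity where nil (or omitting the restriction) updates all senses:

```clojure
(cx/htm-sense htm inval :sensory) ; update sensory senses only
(cx/htm-sense htm inval :motor)   ; update motor senses only
(cx/htm-sense htm inval nil)      ; no restriction: update all senses
```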
Yes, it is embarrassing that I didn't do it this way from the beginning.