Chaotic Exploration and Learning of Locomotion Behaviors

abstract

We present a general and fully dynamic neural system, which exploits intrinsic chaotic dynamics, for the real-time goal-directed exploration and learning of the possible locomotion patterns of an articulated robot of arbitrary morphology in an unknown environment. The controller is modeled as a network of neural oscillators that are initially coupled only through physical embodiment, and goal-directed exploration of coordinated motor patterns is achieved by chaotic search using adaptive bifurcation. The phase space of the indirectly coupled neural-body-environment system contains multiple transient or permanent self-organized dynamics, each of which is a candidate for a locomotion behavior. The adaptive bifurcation enables the system orbit to wander through various phase-coordinated states, using its intrinsic chaotic dynamics as a driving force, and to stabilize on one of the states that matches the given goal criteria. To improve the sustainability of useful transient patterns, sensory homeostasis is introduced, which increases the diversity of motor outputs and thus achieves multiscale exploration. A rhythmic pattern discovered by this process is memorized and sustained by changing the wiring between initially disconnected oscillators using an adaptive synchronization method. Our results show that the novel neurorobotic system is able to create and learn multiple locomotion behaviors for a wide range of body configurations and physical environments, and can readapt in real time after sustaining damage.

A conceptual description of the chaotic search process is illustrated in Figure 1. The goal of the system can be regarded as finding, and becoming entrained in, the basin of a particular attractor that has high performance (denoted by C) while escaping from the low-performing attractors (A and B), regardless of the initial point in the state space. The idea is to open a new pathway that connects those isolated basins by using an additional dimension: the system dynamics are changed by tuning the chaoticity according to the evaluation signal. The orbit visits and evaluates each of the attractors (A, B, C) systematically, yet chaotically, by adaptively varying the bifurcation parameter of the system according to a feedback signal until it reaches the basin of the desired attractor. The process can be interpreted as a continuous and deterministic version of trial-and-error search that exploits the intrinsic chaotic behavior of the system.
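The search loop described above can be sketched with a deliberately small toy model. This is an illustration under stated assumptions, not the neural oscillator controller from the paper: the state is a single number, the three attractors A, B and C sit at made-up positions with made-up performance values, a logistic map stands in for the system's intrinsic chaos, and the "bifurcation parameter" `mu` simply scales how strongly that chaotic signal perturbs the state. All names and constants (`CENTERS`, `PERF`, `K`, `AMP`, `R`, `chaotic_search`) are hypothetical.

```python
# Toy sketch only: NOT the neural oscillator model of the paper.
# All positions, performance values and gains below are assumptions.
CENTERS = {"A": -2.0, "B": 0.0, "C": 2.0}  # hypothetical attractor positions
PERF = {"A": 0.2, "B": 0.5, "C": 1.0}      # hypothetical evaluation signal per attractor
K = 0.2     # strength of the pull toward the nearest attractor
AMP = 6.0   # maximum strength of the chaotic drive
R = 3.99    # logistic-map parameter, chosen in its chaotic regime


def basin(x):
    """Return the label of the attractor whose centre is closest to x."""
    return min(CENTERS, key=lambda name: abs(x - CENTERS[name]))


def chaotic_search(x0, z0, steps):
    """Iterate the toy system and return (states, chaoticity values).

    x is the system state, z is a logistic-map variable that acts as a
    deterministic chaotic drive. The feedback rule mu = 1 - performance
    keeps the drive strong in low-performing basins and switches it off
    in the high-performing one, so the orbit stops wandering there.
    """
    x, z = x0, z0
    xs, mus = [x], []
    for _ in range(steps):
        name = basin(x)
        mu = 1.0 - PERF[name]  # evaluation signal sets the chaoticity
        # Relax toward the current attractor, perturbed by the chaotic drive.
        x = x + K * (CENTERS[name] - x) + mu * AMP * (z - 0.5)
        z = R * z * (1.0 - z)  # deterministic chaotic update
        xs.append(x)
        mus.append(mu)
    return xs, mus


if __name__ == "__main__":
    xs, mus = chaotic_search(x0=-2.0, z0=0.3, steps=500)
    print("final basin:", basin(xs[-1]), "final mu:", mus[-1])
```

In this sketch the search is fully deterministic: starting from attractor A, the orbit is pushed around by the chaotic drive until it enters the basin of C, where `mu` drops to zero and the state relaxes onto the attractor. A smoother feedback rule (for example, low-pass filtering the evaluation signal before it sets `mu`) would be closer in spirit to a continuously adapting bifurcation parameter, but that is a design choice of this toy rather than something stated in the text.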


