Sampling bias in gmm_progress

Because of the boundary constraints on the motor/sensory dimension, the sampling of a new motor/sensory command/goal is biased with increase probability to sample a point at the boundary.

This is due to line https://github.com/flowersteam/explauto/blob/master/explauto/interest_model/gmm_progress.py#L34L35 where the sampled point is constrained to be within the boudaries. A quick visualization of the problem is below.

2017-01-04 15 24 45

It leads to the following patterns of goal selection in a 7 arm experiments. The right plot show in red the goal selected, we clearly see an oversampling on the boundaries, here [-1, 1] in each dimension.

0007

The simplest solution to unbias this sampling step is to keep sampling until the point sampled is within the boundary. The resulting resampling strategy looks much better/less biased. It was run with the same seed, the effect is the same for many seeds.

0007

This has been implemented in pull request #79 by:

adding a resample_if_out_of_bounds arguments for the GmmInterest class, defaulting to False for consistency with previous version. I suggest to turn it True by default if you agree.
updating the sample function accordingly
adding a is_within_bounds function in utils/utils.py

flowersteam / explauto

Sampling bias in gmm_progress #80