CNCLgithub / mot

Model implementation for "Adaptive computation as a new mechanism of human attention"

sensitivity with respect to state #38

Closed eivinasbutkus closed 3 years ago

eivinasbutkus commented 4 years ago

I will lay out my approach to estimating sensitivity with respect to state. Importantly, I'm taking state to be the 2D position of the dots, but we could also consider velocity (or both position and velocity, concatenated into a 4D vector).

There are two important differences from estimating TD/DC sensitivity. First, state is continuous (unlike TD and DC, which are discrete), so computing KL is complicated. We can solve this by approximating the state distribution with a Gaussian and then using the analytic KL for Gaussians (https://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence#Multivariate_normal_distributions). This assumes that state is roughly Gaussian distributed.
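For reference, here is a minimal sketch of the analytic KL between two 2D Gaussians, written in plain Python so it has no dependencies. The function name `kl_gaussian_2d` and the tuple layout for means and covariances are assumptions made for illustration; they are not taken from the repo. It uses the closed-form inverse and determinant of a 2x2 matrix, so it only covers the position-only (2D) case.

```python
import math


def kl_gaussian_2d(mu0, cov0, mu1, cov1):
    """KL(N(mu0, cov0) || N(mu1, cov1)) for 2D Gaussians.

    mu*  : (x, y)
    cov* : ((a, b), (b, c)), symmetric and positive definite
    (hypothetical layout, chosen for this sketch)
    """
    (a0, b0), (_, c0) = cov0
    (a1, b1), (_, c1) = cov1
    det0 = a0 * c0 - b0 * b0
    det1 = a1 * c1 - b1 * b1
    if a0 <= 0 or a1 <= 0 or det0 <= 0 or det1 <= 0:
        raise ValueError("covariance is not positive definite")

    # Entries of cov1^{-1}, using the closed form for a 2x2 inverse.
    i11, i12, i22 = c1 / det1, -b1 / det1, a1 / det1

    # tr(cov1^{-1} cov0)
    trace = i11 * a0 + 2.0 * i12 * b0 + i22 * c0

    # (mu1 - mu0)^T cov1^{-1} (mu1 - mu0)
    dx, dy = mu1[0] - mu0[0], mu1[1] - mu0[1]
    maha = i11 * dx * dx + 2.0 * i12 * dx * dy + i22 * dy * dy

    # The constant 2 is the dimension k of the state vector.
    return 0.5 * (trace + maha - 2.0 + math.log(det1 / det0))
```

Note that KL is not symmetric, so the order of the two arguments matters; the sketch follows the order kl(old, new).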

Second, we do not have the whole distribution of state within each particle (we do have the whole distribution of TD and DC for some particular state within one particle). This is fine though -- we simply have to estimate the position distribution across particles.
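Estimating the position distribution across particles amounts to a weighted mean and covariance. A minimal sketch, assuming each particle contributes one (x, y) position for a given tracker and that the weights are linear (not log) weights; the name `fit_weighted_gaussian_2d` is hypothetical. It uses the weighted maximum-likelihood covariance, with no bias correction.

```python
def fit_weighted_gaussian_2d(points, weights):
    """Fit a 2D Gaussian to weighted samples.

    points  : list of (x, y), one per particle (assumed layout)
    weights : non-negative linear weights, one per particle
    Returns (mean, cov) with cov as ((sxx, sxy), (sxy, syy)).
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    w = [wi / total for wi in weights]  # normalize so the weights sum to 1

    mx = sum(wi * x for wi, (x, _) in zip(w, points))
    my = sum(wi * y for wi, (_, y) in zip(w, points))

    # Weighted second central moments (maximum-likelihood estimate).
    sxx = sum(wi * (x - mx) ** 2 for wi, (x, _) in zip(w, points))
    syy = sum(wi * (y - my) ** 2 for wi, (_, y) in zip(w, points))
    sxy = sum(wi * (x - mx) * (y - my) for wi, (x, y) in zip(w, points))

    return (mx, my), ((sxx, sxy), (sxy, syy))
```

If the particle filter stores log weights, they would need to be exponentiated (after subtracting the maximum, for stability) before being passed in.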

Steps:

  1. Take the weighted particles.
  2. Fit a 2D Gaussian to the weighted position samples for each tracker (or a 4D Gaussian to positions+velocities?). We can call these positions.
  3. For each tracker, for i=1:att_samples, sample one of the particles using the weights and jitter it using rejuvenation. Replace that sample with the jittered version and refit the 2D Gaussian, giving new_position. Compute kl(positions[tracker], new_position) using the analytic KL for Gaussians.
  4. Finally, we average like we do when estimating TD/DC sensitivity.
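The steps above can be sketched end to end as follows. This is an illustration under several assumptions, not the repo's implementation: `particles[i][t]` is taken to be the (x, y) position of tracker t in particle i, the rejuvenation move is replaced by a Gaussian random-walk stand-in (`jitter`), each perturbation is applied to a fresh copy of the samples rather than accumulated, and the final average is a plain arithmetic mean. All function names are hypothetical.

```python
import math
import random


def fit_gaussian(points, weights):
    """Weighted mean and (maximum-likelihood) covariance of 2D points."""
    total = sum(weights)
    w = [wi / total for wi in weights]
    mx = sum(wi * x for wi, (x, _) in zip(w, points))
    my = sum(wi * y for wi, (_, y) in zip(w, points))
    sxx = sum(wi * (x - mx) ** 2 for wi, (x, _) in zip(w, points))
    syy = sum(wi * (y - my) ** 2 for wi, (_, y) in zip(w, points))
    sxy = sum(wi * (x - mx) * (y - my) for wi, (x, y) in zip(w, points))
    return (mx, my), ((sxx, sxy), (sxy, syy))


def kl(mu0, cov0, mu1, cov1):
    """Analytic KL(N0 || N1) for 2D Gaussians."""
    (a0, b0), (_, c0) = cov0
    (a1, b1), (_, c1) = cov1
    det0, det1 = a0 * c0 - b0 * b0, a1 * c1 - b1 * b1
    if a0 <= 0 or a1 <= 0 or det0 <= 0 or det1 <= 0:
        raise ValueError("covariance is not positive definite")
    i11, i12, i22 = c1 / det1, -b1 / det1, a1 / det1
    trace = i11 * a0 + 2.0 * i12 * b0 + i22 * c0
    dx, dy = mu1[0] - mu0[0], mu1[1] - mu0[1]
    maha = i11 * dx * dx + 2.0 * i12 * dx * dy + i22 * dy * dy
    return 0.5 * (trace + maha - 2.0 + math.log(det1 / det0))


def jitter(pos, rng, scale=1.0):
    """Stand-in for the rejuvenation move: a Gaussian random-walk step."""
    return (pos[0] + rng.gauss(0.0, scale), pos[1] + rng.gauss(0.0, scale))


def state_sensitivity(particles, weights, att_samples=10, rng=None, move=jitter):
    """Return one sensitivity estimate per tracker."""
    rng = rng or random.Random()
    n_trackers = len(particles[0])
    sensitivities = []
    for t in range(n_trackers):
        # Steps 1-2: weighted samples of this tracker's position, and their fit.
        points = [p[t] for p in particles]
        base = fit_gaussian(points, weights)
        kls = []
        for _ in range(att_samples):
            # Step 3: pick a particle by weight, perturb it, refit, compare.
            i = rng.choices(range(len(points)), weights=weights)[0]
            perturbed = list(points)
            perturbed[i] = move(points[i], rng)
            new = fit_gaussian(perturbed, weights)
            kls.append(kl(*base, *new))
        # Step 4: average over the att_samples perturbations.
        sensitivities.append(sum(kls) / len(kls))
    return sensitivities
```

With very uneven weights the fitted covariance can be near-singular, in which case `kl` raises rather than returning a meaningless number.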

At the moment, I'm running into numerical instability when fitting the 2D Gaussian, getting PosDefException: matrix is not positive definite; Cholesky factorization failed.
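One common workaround for this kind of failure (not necessarily the fix adopted here) is to add a small ridge to the diagonal of the covariance before factorizing. The sketch below shows the idea for the 2x2 case in plain Python; `cholesky_2x2` and `add_ridge` are hypothetical helpers, and the value of `eps` is an assumption that would need to be small relative to the scale of the positions, since it also changes the resulting KL values.

```python
import math


def cholesky_2x2(cov):
    """Lower-triangular Cholesky factor of a symmetric 2x2 matrix.

    Raises ValueError when the matrix is not positive definite,
    mirroring the failure described above.
    """
    (a, b), (_, c) = cov
    if a <= 0:
        raise ValueError("matrix is not positive definite")
    l11 = math.sqrt(a)
    l21 = b / l11
    rem = c - l21 * l21
    if rem <= 0:
        raise ValueError("matrix is not positive definite")
    return ((l11, 0.0), (l21, math.sqrt(rem)))


def add_ridge(cov, eps=1e-6):
    """Add eps to the diagonal so a near-singular covariance becomes PD."""
    (a, b), (_, c) = cov
    return ((a + eps, b), (b, c + eps))


# A rank-deficient covariance, as when all samples lie on one line.
singular = ((1.0, 1.0), (1.0, 1.0))
factor = cholesky_2x2(add_ridge(singular))  # succeeds after regularization
```

Calling `cholesky_2x2(singular)` directly raises, which is the 2x2 analogue of the exception above.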

belledon commented 4 years ago

thanks for spelling this out.

In terms of instability, i'd imagine that most of the unweighted samples are the same and thus you will have very low variance.

One question here: how would sensitivity differ across trackers? One possibility is that it will be proportional to the amount each tracker has left to converge, but i'm not 100% sure.

eivinasbutkus commented 4 years ago

Actually, I was using weighted samples, but you're right that the very low weight of the other samples can make it unstable.
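A quick way to check whether the weights are the problem is the effective sample size, (sum of w)^2 / sum of w^2. A minimal sketch, assuming linear weights; the function name is hypothetical. A 2D Gaussian fit needs at least three effective samples that are not collinear, so a value close to 1 suggests the fitted covariance will be close to singular.

```python
def effective_sample_size(weights):
    """Effective sample size of a set of non-negative linear weights.

    Equals len(weights) for uniform weights and 1 when a single
    particle carries all of the mass.
    """
    total = sum(weights)
    sq = sum(w * w for w in weights)
    if sq <= 0:
        raise ValueError("weights must contain a positive value")
    return total * total / sq
```

For example, four equally weighted particles give an effective sample size of 4, while one particle holding all the weight gives 1.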

> How would sensitivity differ across trackers?

I think you're right that it'd be proportional to how much the tracker has left to converge. It'd take the opportunities where the moves in the state distribution are largest (unsurprisingly, by definition), but I'm not sure where that is exactly. It should definitely be much more uniform (attending to things on their own and so on).