Sensor Fusion between IMU and optical tracker

cboulay commented 8 years ago

I'm just dropping a couple links to papers here for now.

HipsterSloth commented 8 years ago

I wonder if the quaternion averaging method I linked lead to a degenerate quaternion in certain cases? This quote from the Baldassano thesis (page 6) was interesting:

Due to numerical round-off errors, it is possible for the state covariance matrix to cease to be positive definite, causing the UKF algorithm to fail when taking a square root of the covariance to calculate the sigma points. To prevent this possibility, the square root of the covariance matrix can calculated directly during every time step, without using the actual covariance matrix

Is the python kalman filter library re-calculating the covariance every step?

Maybe would could store the rotation in the state as an angle axis representation (length of the vector is the rotation angle)? That would at least simplify the angle wrapping.

HipsterSloth commented 8 years ago

I just updated the C++ KalmanPoseFilter to bring it mostly in line with the python one: f405a3fa0f156ee10739323a4d8b4a005778519f

Removed control Vector
Removed angular acceleration and gyro bias from the state
Updated SystemModel model to advance the state physics correctly
Updated the PSMove and DS4 Measurement Models to properly make predictions using the sensor measurements
Modified filter update() calls to set position and orientation in the state on the first call after a reset.

I still need to initialize the process and measurement noises to match the way you are doing it. I'll work on that tomorrow. I did notice two things while porting over the code:

1) It looks like you are missing a unit conversion on the accelerometer data here. I assume the acceleration in the state is stored in cm/s^2, but the accelerometer measurement is in g-units (1g = 9.8m/s^2). 2) mherb/Kalman doesn't appear to allow you to compute overrides for mean and state differences (that I could see). Is that not needed for the SRUKF? Or are we going to have to figure out a way to override some function in the SquareRootUnscentedKalmanFilter to make that work?

cboulay commented 8 years ago

the square root of the covariance matrix can calculated directly during every time step

This is exactly what the square-root version of the UKF takes care of. Unfortunately the Python library I'm using doesn't do square-root UKF, so that might be part of the problem. The C++ library does.

Maybe would could store the rotation in the state as an angle axis representation (length of the vector is the rotation angle)? That would at least simplify the angle wrapping.

That would solve the issue with the state vector having 16 elements but the state covariance matrix having 15. Can the angle-axis representations be added/subtracted with normal elementwise +/-? If not, we'd still have to roll our own sigma-point calculation, mean calculation, difference calculation, etc.

Updated the PSMove and DS4 Measurement Models to properly make predictions using the sensor measurements

Just to be clear, you should be doing the opposite: using state variables to predict sensor measurements. These predictions are then used to calculate the Kalman gain, compared with the true measurements to get the innovation which is used to update the state (and state covariance).

missing a unit conversion on the accelerometer data

Yes, you're right, thanks.

mherb/Kalman doesn't appear to allow you to compute overrides for mean and state differences (that I could see). Is that not needed for the SRUKF? Or are we going to have to figure out a way to override some function in the SquareRootUnscentedKalmanFilter to make that work?

The overrides (+ sigma-point calculation which I'm not yet doing in Python) are needed because addition and subtraction of state variables doesn't work with quaternions or angles using the simple + and - operators. You might be able to overload the operators, but I think this is more inefficient than overriding the functions.

Problematic parts:

sigma point calculation
compute new state mean from propagated sigma points. We only want to modify it for the state mean, not the measurement mean.
Probably others.

I biked home in the rain so I kept my laptop at work. I'm using my wife's laptop and it doesn't have the tools to be effective so I'll pack it in for the night.

HipsterSloth commented 8 years ago

Just to be clear, you should be doing the opposite: using state variables to predict sensor measurements. These predictions are then used to calculate the Kalman gain, compared with the true measurements to get the innovation which is used to update the state (and state covariance).

Ah sorry that comment was a typo on my part. I'm doing exactly that.

Can the angle-axis representations be added/subtracted with normal elementwise +/-?

I thought it could, but after some reading it turns out I was mistaken. You have to convert angle-axis rotations to some other representation to concatenate them :(. So it looks like we're stuck with either euler angles or have the overrides.

Problematic parts: sigma point calculation. compute new state mean from propagated sigma points. We only want to modify it for the state mean, not the measurement mean.

Thanks for the pointers there. After I get the process and measurement noise matrices filled in tomorrow I'll look into how to best override those parts (assuming we still want to stick with having a quaternion in that the state).

cboulay commented 8 years ago

I'm sure you saw it because I mentioned you by name in the issue, but I opened one up here. I think the correct solution would be similar to the example I gave, and to do so not just in the SRUKF but for all UKFs.

I sometimes confuse the covariance matrices. Let's include the capital letter to denote the kind of covariance matrix we're talking about. P is the state covariance (intrinsically updated on every tick), Q is the process noise covariance, and R is the measurement noise covariance. (I hope that's correct...)

To fill in Q, there's that matrix you pasted above, but that comes from a general equation found here

where DELTA is the time interval and N is the order of the process model.

For the linear space, our process model is 3rd order. For the angular space, our process model is second order. That can be used to fill in Q for position, vel, accel, ang_vel, but not for orientation if we use a quaternion in our process noise. Alternatively, our process noise can use the angle-axis representation and then we can use the above equation in Q. As a result, all of the processing steps that make use of Q (and, by extension, P because P depends on Q) will have to be customized to pull out the angle-axis representation of orientation and treat it properly. Edit: But we have to customize those functions anyway even if using quaternions in Q (and P), so why not use the angle-axis representation so we have a physically-based way to estimate the entries in Q.

HipsterSloth commented 8 years ago

I recall seeing a similar expression in Enayati's thesis:

Though at the time I didn't realize Enayati was talking about the process matrix Q. Thanks for clarifying that.

It looks like the mherb/kalman library expects the Q covariance matrix to have the same number of rows/columns as the state has rows:

    /// Where Type = StateVector<float>
    template<class Type>
    using Covariance = SquareMatrix<typename Type::Scalar, Type::RowsAtCompileTime>;

Given that and the fact that we're going to have to override the sigma point calculation and state mean anyway regardless of whether the state contains a quaternion or an angle-axis vector I think that forces us into putting a 3-variable orientation representation into the state. Unless you have any objections I'm going to change the rotation state to angle axis.

cboulay commented 8 years ago

Sure. If you want to go down a tangential rabbit hole, a 3D representation of orientation called "Modified Rodrigues Parameters" has established equations of motion that can be used in the process model. These seems complicated to me, but might be less complicated than doing 3D to quat, quat multiplication, quat to 3D.

HipsterSloth commented 8 years ago

Quick update on noise parameter stuff. I just checked in fdf66fd3cfb7fc47b0a03395d47a58842fab370e:

Reordered state to [pos_x, vel_x, acc_x, pos_y, ..., ang_x, omega_x, ang_y, ...]
SystemModel now fills in process covariance
- Added Q_discrete_3rd_order_white_noise to initialize linear noise parameters, adapted from filterpy
- Added Q_discrete_2rd_order_white_noise to initialize rotational noise parameters
- Added update_process_covariance() to update Q for changes in optical tracking quality
MeasurementModels (PSMove and DS4) now fills in measurement covariance
- init() sets the initial covariances based on constants (still need to fill these in)
- update_measurement_covariance() sets covariances based on constants and optical tracking quality
Changed UKF kappa constants to 0 since filter is 3D
- The Kalman filter Jupyter notes implied that kappa should be 0 if the dimension of the filter is 3. Am I understanding that correctly?
Added new entries to PositionFilterConstants (still need to fill these in)
- mean_update_time_delta
- min_position_variance
- max_position_variance
Added new entries in OrientationFilterConstants (still need to fill these in)
- mean_update_time_delta
- min_orientation_variance
- max_orientation_variance
- magnetometer_variance

Sure. If you want to go down a tangential rabbit hole, a 3D representation of orientation called "Modified Rodrigues Parameters" has established equations of motion that can be used in the process model. These seems complicated to me, but might be less complicated than doing 3D to quat, quat multiplication, quat to 3D.

I did a quick google search for this and found a few papers about using Modified Rodrigues Parameters in the state and quickly got in over my head. I'm going to start the mherb/Kalman override work now. If that starts to get too complicated I'll double back to this other approach. I'm hoping that the angle wrapping intrinsic to angle-axis isn't going to be a problem. /crosses-fingers

EDIT: fixed broken jupyter link

cboulay commented 8 years ago

The Kalman filter Jupyter notes implied that kappa should be 0 if the dimension of the filter is 3. Am I understanding that correctly?

In that sentence, dimensionality doesn't refer to the dimensionality of the space, but the length of the state vector. In the first UKF radar example (code box [18]) they're using 3 state variables (i.e., state has a dimension of 3): x, dx, and y, so kappa = 0. A little further down they add another variable dy (state has dimension 4) so kappa = -1 (box [19]).

Our state vector has dimension = 15, so kappa = -12. I'm not sure if this rule holds when you get to this many state variables.

I converted my code (not pushed yet) to using 16d state vector and 15d process noise. Even though filterpy allows for custom arithmetic steps on state-sigmapoint and state-covariancecolumn, there are parts that are hard-coded to expect the same dimensionality in the state variable and the covariance columns. I didn't finish working around those before I left work today. The reason I'm trying it this way instead of use angle-axis representation is because I know how to do quaternion addition/subtraction, and I think I have a handle on doing quaternion averaging. I don't know how to do these operations with angle-axis representations. Do you know how you're going to do that, or are you just going to convert them back to quats and do the arithmetic there?

Just in case it ends up being useful, filterpy has an example for averaging angles (only look at the index-2 variable) by (weighted) accumulation of sin and cos of the angle, then taking the atan2 of those. Below that they also have an example for subtracting angles, though there's a bug that is undetected because it's just a code-comment. It should be y += 2*np.pi

HipsterSloth commented 8 years ago

In that sentence, dimensionality doesn't refer to the dimensionality of the space, but the length of the state vector.

Whoops! In that case I should probably stick with the -1 value you used in your implementation.

Do you know how you're going to do that, or are you just going to convert them back to quats and do the arithmetic there?

I was just planning on converting them back to quaternions for the purpose of averaging, then back to an angle axis for storing back into the result. I don't know of a way to average together angle axis vectors short of converting them back to quaternions.

Just in case it ends up being useful, filterpy has an example for averaging angles (only look at the index-2 variable) by (weighted) accumulation of sin and cos of the angle, then taking the atan2 of those

I was actually just looking at that. Did I mention that the Jupyter page is really well done? What a great tutorial.

On the bus ride home I was thinking if there were other simple rotation representations that would minimize weird discontinuities, had simpler math, had easily inspect-able state and were more amenable to simple addition/subtraction.

Crazy idea: We could represent the rotation state as a 3d vector on a sphere (encodes yaw&pitch) + a roll angle i.e. [rx, ry, rz, roll] where rx^2 + ry^ + rz^2 = 1 and -pi < roll < pi. If the sigma points for the r-vector were within the same hemisphere averaging the r-vectors and then renormalizing gives a reasonably averages yaw-pitch. The average roll vector would use the atan2 trick you just mentioned.

HipsterSloth commented 8 years ago

Checked in an attempt to override the SRUKF: fa8d4812895091e77fbb3379c51ca09995534b13

There is now a CustomSRUFK that derives from Kalman::SquareRootUnscentedKalmanFilter:

Override for computeSigmaPoints() to perform custom application of state deltas
Added computeStatePredictionFromSigmaPoints() and computeMeasurementPredictionFromSigmaPoints() to compute custom averaging of sigma points.
- linear states are averages together using simple matrix multiplication with the weight vector
- orientations get extracted to quaternions, averaged, then stored back as an angle-axis state

I also added StateVector::addStateDelta() which applies a state delta to a state vector (replaces simple addition/subtraction). All of the linear components just get a simple linear addition. The orientation gets multiplied by the orientation delta quaternion.

Going to start testing this tomorrow.

HipsterSloth commented 8 years ago

D'oh! I missed a bunch of places for computing custom differences for state and measurement:

In computeCovarianceSquareRootFromSigmaPoints here and here
In computeKalmanGain here and here
And in update here

We might be able to accomplish this with operator overloading but I'd rather be explicit since this operation isn't really a pure subtraction. At this point we're almost completely overriding the SquareRootUnscentedKalmanFilter class, so I think I'm just going to make the CustomSRUFK derive from the lower UnscentedKalmanFilterBase and be it's own parallel implementation.

EDIT: typos

cboulay commented 8 years ago

I just decided on the same thing in Python. I made all the changes, I think, but now the matrix factorization is crashing because this isn't SR-UKF.

I'm actually going to use the algorithm that is slightly different to what everyone else uses, but from the same authors, and it's the one they used in their Matlab toolbox so it should be easy to reproduce.

HipsterSloth commented 8 years ago

Just checked in a heavily reactored CustomSRUKF class that now handles state and measurement differencing and averaging. I also collapsed all but one of the private methods I copied from the original SRUKF implementation into update and predict. I was getting crazy compiler errors (the compiler was crashing trying to deal with template shenanigans). The more I look at these kalman filter implementations the more I think you're better off adapting a worked example of a filter than trying to use a generalized framework and plugging in your model (the update and predict methods aren't really that much code). I think once I get the filter debugged i'll replace the last of our usage on mherb/kalman.

HipsterSloth commented 8 years ago

All sensor values now have variance measurements that can be calibrated and are stored in each controller config.

The accelerometer, magnetometer and gyro calibration tools have been updated to compute variance and store the result along side the other config values. I think the defaults will be fine for most people, so they wont need to run the accelerometer or gyro calibration tools in most cases. But if someone is getting weird behavior we can have them run the calibration tool and look at the statistics to see if there is an anomaly.

I also tried to compute the optical position and orientation variances in the calibration mat tool. My thought being that the calibration mat tool already has the user place the controller in fixed locations and this tool most sensitive to local conditions. The specifics are as follows:

1) Record the variance of optical position (and optical orientation in the case of the DS4) and the tracking shape projection area (in pixels) at each of the 5 calibration mat locations

For an ellipsoid projection the area is pi x half_width x half_height
For the light bar we just use the area of the bounding box projection

2) Compute a best-fit line through all the (tracking area, variance) samples (5 locations x N cameras).

if you had 2 cameras you'd have 10 sample pairs

3) The best fit line parameters allow you to compute a position variance (and orientation variance for the DS4) as a function of tracker projection area.

i.e. The larger the tracking shape is on screen, the smaller the variance should be

Unfortunately the data I got back from the calibration mat didn't have an obvious trend line:

I thought maybe the samples I was picking were too close together so I tried again over a larger area (using the box my Vive shipped in as a flat surface). This at least gave a trend in the direction I would expect, but I got a few crazy high variances:

I tried again in the hope that maybe the controller was just wobbling a bit when I set it down on that particular control point, but I got a similar result:

At this point I'm thinking either this tool is still too sensitive to wobbling controllers OR the the variance is not a linear function of tracking shape area (probably quadratic). For now, I just picked a constant default for the optical position and orientation variance that seemed reasonable. I think a simple hand tuned constant is fine for the moment.

My plan Monday is to start debugging the C++ Kalman filter now that all the pieces I needed for it are in place. I'll let you know as soon as I have some interesting data to show so for it (working or not).

cboulay commented 8 years ago

Optical variance per-axis or 3-D distances? in cm? I have z-variance being about 5-6x that of x- and y-variances. They probably shouldn't be bundled together.

I haven't looked at the math in a while (and am a bit busy atm) but I don't think the relationship between x-pos and area is the same order as z-pos and area. A first guess is that x- and y-variances scale with distance^2 but z-variance scales with distance^4, but I'd have to look back at the cone fitting math to have any confidence in that.

HipsterSloth commented 8 years ago

Optical variance per-axis or 3-D distances? in cm?

max(var(x), var(y), var(z)), which is effectively just the variance on the z-axis in cm.

I have z-variance being about 5-6x that of x- and y-variances. They probably shouldn't be bundled together.

Totally agreed that they shouldn't be bundled. It just gets a bit more complicated in the multicam situation so I wanted to keep it simple for the first pass and pick an upper bound.

I haven't looked at the math in a while (and am a bit busy atm) but I don't think the relationship between x-pos and area is the same order as z-pos and area. A first guess is that x- and y-variances scale with distance^2 but z-variance scales with distance^4, but I'd have to look back at the cone fitting math to have any confidence in that.

That's a good point about the order differece. I think what I need to is pull the variance calculation stuff out of the camera pose estimation tool into it's own tool and sample the xy and z variances at a wider set of ranges and then do some curve fitting in excel.

cboulay commented 8 years ago

I just pushed a commit to client_capi branch with an implementation of the SRUKF using augmented matrices (i.e., this Appendix A.C). Everything is in place but I'm getting non positive definite matrices so it's crashing when trying to do a cholesky update of the square root state covariance matrix.

I'll have to go through line-by-line and make sure I'm not messing up any indices anywhere, but I thought I'd push it now because it might be useful to you; I tried to make it clear what I was doing at each step. But you probably shouldn't try to implement this in C++ until I fix the problem, just in case it ends up being a dead end.

The entry point is misc/matlab/client/filter_datafiles.m

HipsterSloth commented 8 years ago

I'm hitting a similar issue setting up the initial Q matrix. It's supposed to be symmetric, but it's not (trying to debug that now). Just be be sure Q_discrete_3rd_order_white_noise is suppose to compute a square symmetric matrix because it's: [.5dt^2, dt, 1][.5dt^2, dt, 1]^T \ variance right?

EDIT: I was mistaken. My initial Q was symmetric, but it's still failing to compute the cholesky of Q. I just tried the following in GNU Octive with a 3x3 symmetric matrix corresponding to a dT of 1 and a variance of 1:

>> [0.25, 0.5, 0.5; 0.5, 1.0, 1.0; 0.5, 1.0, 1.0]
ans =

   0.25000   0.50000   0.50000
   0.50000   1.00000   1.00000
   0.50000   1.00000   1.00000

>> chol ([0.25, 0.5, 0.5; 0.5, 1.0, 1.0; 0.5, 1.0, 1.0])
error: chol: input matrix must be positive definite

According to wikipedia a positive definite matrix must have positive eigen values:

>> eig ([0.25, 0.5, 0.5; 0.5, 1.0, 1.0; 0.5, 1.0, 1.0])
ans =
  -1.1179e-016
  7.6571e-019
  2.2500e+000

Looks like this formulation of Q has eigen values = 0 (or close due to numerical error). So I guess we need some different form of Q to work with SRUKF?

cboulay commented 8 years ago

The equation a bunch of posts above is not the same as the Q_discrete_white_noise function, and it returns a positive definite matrix. Here's the vectorized Matlab code. It's a little hard to read, but the . before an operator means 'elementwise'. It would probably be much easier to write it out in a couple of nested for loops, but Matlab is slow at loops so I vectorize out of habit.

function q = process_noise(N, DELTA)
% http://www.tandfonline.com/doi/abs/10.1080/00207728908910103
my_i = (1:N)';
my_j = 1:N;
my_i_plus_j = my_i(:, ones(N,1)) + my_j(ones(N,1), :);
X = 2*N + 3 - my_i_plus_j;
nums = DELTA^X;
facs = factorial(N + 1 - my_i);
denoms = (facs * facs') .* X;
q = nums ./ denoms;

cboulay commented 8 years ago

Pushed another update to the matlab code. I renamed some variables to bring them inline with the reference paper. I got through the crash in predict step. Now it's crashing at the very last step of update. So close. Too sleepy to investigate further.

cboulay commented 8 years ago

No more crashing! I wasn't adding the noise properly during process function and observation function.

The position estimate is OK but it actually seems a bit more jittery than the optical position. This is probably because the orientation estimate is terrible. I bet that some of the axis-angle math is wrong.

I need to take a step back from this and catch up with some other work for a little while.

HipsterSloth commented 8 years ago

Thanks for looking in to that. I can start porting over the newer matlab code if you don't have any objections.

cboulay commented 8 years ago

Please, go ahead.

I'm fairly confident it's a faithful representation of the algorithm presented in the paper I linked above, aided by the code in the ReBeL toolbox. That algorithm does support having different dimensionality in the state vector and the process noise (if you wanted to go back to using quaternions in the state vector), but I may have removed that in my re-implementation.

I'm pretty sure the only remaining issues are in the orientation math. Some things to think about:

What is variance? In linear-space, it's the square of the distance from the mean. In angular-space, is it the square of the magnitude of the rotation?
- Is that really what we're providing in Q?
- Is X_k_r giving us the (sqrt(variance)=) magnitude of the rotation in each axis, or something else?
- Is each axis independent?
What does it mean to scale a rotation by a weight? See, e.g., srukf_predict step 8. Do we need to do any normalization or other fixing to make it a real rotation before doing the qr decomposition?
What exactly does the qr decomposition return?
- Is it still expected to be a physical thing, and, if so, do we have to tweak the result to make sure the elements representing rotations are still representing rotations?
- Or can this be deferred until after the cholesky update?
After the cholesky update, Sx_k is the sqrt of the state covariance.
- Variables in rotation-indices along the diagonal represent magnitude of rotation. Do we have to do anything to these to make sure the magnitudes make sense? (i.e., mod(rotation, 2*pi) ?)
- Variables in off-diagonals can be sqrt(rotation_magnitude * distance). What does this mean? Does this have to be tweaked to make physical sense, or can we ignore it?
Similar to above, in the calculation of the process-observation covariance (Pxy in srukf_update step 5), each state variable residual is multiplied with each observation residual. The result is things like multiplication of an (e.g.) AxisAngle x value (i.e., a part of a rotation) with (e.g.) an optical_position Z value. Do we need to do anything special here?
The magnitudes of the rotations about each axis SHOULD BE capped at [pi,-pi). The magnitudes of the linear variances are not capped. Do we compensate for this by increasing the process noise magnitude for rotation variables?

HipsterSloth commented 8 years ago

I was just trying to walk through the process_noise method in the matlab code and the nums = DELTA^X; line confused me. At first glance I would expect that this would result in a component-wise operation, but that's not what I'm seeing in GNU Octave.

So for example if the order is 3 the matrix X looks like this:

If I do 0.1^X I would expect 0.1 to get raised to the corresponding entry in X, since that looks like what is intended in the process noise equation from earlier:

  1.0000e-007  1.0000e-006  1.0000e-005
  1.0000e-006  1.0000e-005  1.0000e-004
  1.0000e-005  1.0000e-004  1.0000e-003

But instead I get this:

   1.07251  -0.46864  -1.00979
  -0.46864   0.68688  -0.15761
  -1.00979  -0.15761   1.69457

I only get the expected matrix if I do this: 0.1.^X. Is that a bug in the process_noise function? Or do we actually want to raise a constant to a matrix power? And if so what does operation even mean in C++?

cboulay commented 8 years ago

That was a bug. Thanks.

That actually made the predicted result much worse :P but I can tweak the noise scales a bit to take care of that.

HipsterSloth commented 8 years ago

Wanted to give a quick update.

After my last post I made a bit of a detour to see if I could improve the optical orientation extraction for the DS4 using stereo triangulation of the 7 light bar points (rather than using solvePnP per camera). The solvePnP call is kind of expensive and I was worried it was going to be a major perf bottleneck (zelmon64 was hitting bad perf with multiple cameras). I finally got that working Friday (has more reliable optical yaw too!)

On Sat I switched back to porting over your matlab code into the KalmanPoseFilter. I was able to port over everything except srukf_update() before it got too late on Sunday. Hopefully I can get the rest of this ported in the next evening or two. Then I can think a bit more on the orientation math questions you posed.

HipsterSloth commented 8 years ago

Finished porting over the remainder. Thanks a ton for providing that framework! That is an intense amount of mathematical machinery you authored. Not gonna lie. Matlab makes my brain melt, but I'm glad we have that as a reference implementation that I can test this code against.

No rush on this, but when you have some time I'd love it if you could look over the SRUKF class. It definitely needs a second pair of eyes.

Some random things I wanted to call out from the porting process:

In the matlab code, I think the accelerometer calculation in the observation function is pulling from the linear velocity, not the acceleration, if I'm reading the indices correctly anyway.
While you can extract contiguous sub-blocks from a matrix in Eigen, as far as I know there is no good way to composite together disjoint rows/columns ex: states([2 5 8], :). In those instances for state and observation vectors I relied on a StateVector and MeasurementVector classes to extract the fields I needed to do fancy operations on like special_state_add.
I'm pretty iffy about whether I should use the upper or lower triangular matrices for the QR decompositions and the rank updates. In Eigen I think the R matrix is stored in the upper triangle for QR updates. I'm really not sure which half I was supposed to be using for the Cholesky rank up/downdates.
I'm super iffy about the mrdivide operator used to calculate the KalmanGain. After some digging around it looks like I should interpret A/B as A.transpose().colPivHouseholderQr().solve(B.transpose()). I'm curious if this agrees with your intent.

I think I might make a test app for just the filter similar to what you have for filter_datafiles.m It's be good to run it side by side with the matlab code and see where it diverges.

cboulay commented 8 years ago

In the matlab code, I think the accelerometer calculation in the observation function is pulling from the linear velocity, not the acceleration, if I'm reading the indices correctly anyway.

Yup. That's a mistake from 0-indexing to 1-indexing. Thanks.

Use the lower triangular matrices. Matlab functions return the upper triangular matrices but this is regarded as a quirk, as everything non Matlab returns lower. In the Matlab implementation, the upper triangular matrix Sx_k is kept in memory in the filt_struct, but that's only because it's used by another Matlab function (cholupdate) that expects upper. Whenever one of these upper triangular matrices are used for linear algebra outside of these functions (qr, cholupdate, cholesky), the transpose is taken to get the lower triangular matrix. The transpose of the upper is not identical to the lower but I think it might just be rounding errors.

There are several ways to solve the system of equations to get the Kalman gain.

In Matlab, KG = (Pxy/Sy')/Sy;
In Python 1, 2

Edit: C++ methods

Using gsl
sfwa seems too easy.
mherb

HipsterSloth commented 8 years ago

I updated the filter with the changes you suggested. The Kalman gain is now computed the same way it's done in mherb/kalman. The filter runs without crashing, but I don't see the controller show up when I do the tracking test, so I need to debug further. I think the next best step it to make a test app that runs through the pre-recorded data and compare it to what you're getting in matlab.

cboulay commented 8 years ago

OK, but what I'm getting in Matlab isn't correct.

Please take a look at the Python script to make sure the data I'm writing to disk are in fact the data we want the filter to use. The Python script could also use some tweaks to directly write the data to the csv file, instead of relying on me to first run python test_psmoveclient.py > output.csv and then edit the csv to delete the lines we don't want, but I'll only bother if the already-present data is incorrect for some reason.

We'll talk about how to get the filter working correctly after you get your test environment up and running.

n8rockerasu commented 8 years ago

I don't mean to butt into your guys work, but I just wanted to say thanks for putting so much effort and time into this. I read through all the progress you guys are making (way over my head, haha), and just wanted to let you know we all appreciate your hard work.

Is it safe to assume this is the next big priority for PSMoveService? It seems like more and more games are being released that rely on throwing things ("Grenade Skeeball" in Hotdogs, Horseshoes & Hand Grenades, Quarterback SNAP, nVidia VR Fun House, etc) and other games seem to be improving their coding to where things like velocity make a bigger difference (The Thrill of The Fight, VR Baseball). So, getting this figured out would quite literally be a game changer.

Best of luck to both of you in your work on this. And thanks again!

HipsterSloth commented 8 years ago

@n8rockerasu Yep. This is arguably the biggest hole in PSMoveService. It's also the largest technical challenge which is why it's taken up so much time. Once this is done we should have clean controller velocities we can pipe to steamvr. Fortunately we are getting close. The filter is written (both in PSMoveService and in our Matlab testbed). Now were in the process of debugging it. /fingers-crossed

HipsterSloth commented 8 years ago

@cboulay I think I came across another typo in the matlab code, but I wanted to run it past you. I wasn't getting results from sub_axang that matched what I got in c++. Shouldn't this function:

function output = quaternion_conjugate(quat)
output = quat;
output(2:end) = -quat(2:end);

Actually be this:

function output = quaternion_conjugate(quat)
output = quat;
output(2:end,:) = -quat(2:end,:);

The former appears to negate the w component of some columns when quat is a matrix rather than a column vector.

HipsterSloth commented 8 years ago

Found another issue with the dt. I was getting way different positions in the state vectors (in X_k) generated from the process function in matlab compared to the c++ despite having the same input augmented Sigma point matrix X_t. The only clear thing that would cause this would be if the time delta was different. I inspected the time deltas in filter_datafiles.m and noticed I was getting a crazy huge dt of 30:

%% Filter
%Pre-allocate output variables
observations = testdata(:, 1:Odim); %% I think this is wrong
predicted_state = nan(size(observations, 1), Xdim);
o_ix = 1;
%%
for o_ix = 1:size(observations, 1)
    %%
    dt = observations(o_ix, end) - filt_struct.last_obs_time;

From the looks of the code it appears like we're trying to treat the last column of the observations table as a timestamp, but the last column of the observations table is actually the z-value of the position measurement. I think we actually want the observations matrix to be this instead:

observations = testdata(:, [1:Odim,end]);

Where the end column from the test data is the timestamp. When I run the sample with this change it looks like we're over smoothing the data?

cboulay commented 8 years ago

Yes, those were both errors. Thank you.

After fixing them, my positional estimate now matches perfectly the true positions. Too perfectly actually. Well, if the r_scale (process noise) <= q_scale (observation noise), then they seem to match perfectly. If r_scale > q_scale, then the prediction introduces unreasonable noise.

ukfposition_corrected

Not shown here, but the orientation estimate is still a mess.

HipsterSloth commented 8 years ago

I figured out a technique this evening for comparing intermediate matrices in C++ to their corresponding matrices in matlab. I wanted to run it by you in case you knew of a better way to do this. Previously I was just spot checking them, but I kept missing differences (and it was taking forever):

Copy a matrix out of the watch window in MSVC into notepad++:

        [0] -0.10000000000616356    double
        [1] 0.00000000000000000 double
        [2] 0.00000000000000000 double
        ...

Do a quick search and replace to create a column vector I can paste into matlab:

Sx_k_cpp = [
-0.10000000000616356;
0.00000000000000000;
0.00000000000000000;
...
]

Paste the column vector into matlab and subtract it from a column vector version of the matrix I want to compare it against. Computing a sum of square errors gives me a general difference:

sqrt(sum((Sx_k_cpp - reshape(Sx_k, 15*15,1)).^2))

If I want to see the location largest difference I do this:

[num idx] = max(abs(Sx_k_cpp - reshape(Sx_k, 15*15,1)))
[row col] = ind2sub(size(Sx_k), idx)

With this I found a few more bugs in how I was computing the state sigma points and residuals in update(). I'm now getting the same results up to the point where I do the cholesky update/downdate.

Unfortunately I can't seem to get the same results as the downdate no matter how I adjust it:

Original Matlab

%filt_struct.weights.w_qr and .w_cholup cannot be negative.x
[~, Sx_k] = qr((filt_struct.weights.w_qr*X_k_r(:,2:nsp))', 0);
%NOTE: here Sx_k is the UPPER Cholesky factor (Matlab excentricity)
if filt_struct.weights.wc(1) > 0
    Sx_k = cholupdate(Sx_k, filt_struct.weights.w_cholup*X_k_r(:,1), '+');
else
    % deal with possible negative zero'th covariance weight
    Sx_k = cholupdate(Sx_k, filt_struct.weights.w_cholup*X_k_r(:,1), '-');
end

C++

// Perform additional rank 1 update/downdate (depending on sign of wc[0])
float wc0_sign = sgn(W.wc(0));
Sx_k.selfadjointView<Eigen::Upper>().rankUpdate(X_k_r.leftCols<1>(), W.w_cholup*wc0_sign);

Is there anything that stands out to you there that I might be doing wrong?

BTW it turns out the R matrix of the QR decomposition we use for Sx_k is stored in the upper triangular portion of the QR result. See notes here. So I do the following to get Sx_k pre-cholesky update:

Sx_k = qr.matrixQR().topLeftCorner<X_DIM, X_DIM>().triangularView<Eigen::Upper>();

cboulay commented 8 years ago

Your code for comparing the two matrices could be made a bit more Matlabic, but it looks correct.

Sx_k_cpp = nan(15);
Sx_k_cpp(:) = [<pasted values>];
sqe = (Sx_k_cpp - Sx_k).^2;
total_dist = sqrt(sum(sqe(:)));
[row, col] = find(sqe == max(sqe(:)));

As for the difference between Matlab's and Eigen's cholesky downdating, I have no idea. I haven't looked at Eigen's way of doing this. I know that Matlab returns lower triangular, and that it is undesirable generally, so you don't want to emulate Matlab. If anything, you should be emulating what others have done in C++, using the upper triangular, then transforming the Matlab results to upper-triangular for comparison.

Late last week, I noticed another error in my Matlab code. I made a bunch of other changes too, trying to go back to using quaternions in the state variable because that's what the other quaternion papers have done that I now understand enough to emulate, but that's still in progress. Tonight I'll try to clean it up as much as possible, remind myself where the error was and let you know.

cboulay commented 8 years ago

I refactored a lot of the Matlab code in the latest commit.

The orientation is still wrong. I'm pretty sure it's correct for the predict step, but it's wrong in the update step. The 'innovation' is way too big. Seeing as how the the predicted orientation is pretty close to the original orientation (at least for the first iteration), the predicted observation should probably be pretty close to the true observation, but they are very different. The observation function must be wrong. (That might not be the only thing that's wrong)

Can you please double-check and tell me the units and reference frame being returned by ControllerState.PSMoveState.CalibratedSensorData (for each of the 3 sensors)?

HipsterSloth commented 8 years ago

Thanks for the updates! I'll start porting those over to the C++ code tomorrow evening.

All three calibrated values should be given in the controller's frame of reference. The calibrated accelerometer and gyro values for the psmove controller ultimately come from the poll method. The units of the accelerometer should be in Gs and the gyro readings should be in radians/second.

I can verify that the gyro is in radians/s with the test_console_CAPI client reports the gyro readings as x=-2.64938807 y=3.74107552 z=-14.8763361 when rapidly spinning around the PSMove. Those numbers would be much larger if they were in degrees/s.

The calibrated magnetometer values come from further down in the poll method. The magnetometer measurement should be a unit vector. It's possible that the magnetometer identity direction you are using might need to be rotated 90 degrees (as if the controller was lying flat when reading the identity direction rather than standing upright).

I feel like when I was porting over the update() function some of the world -> controller space quaternion transforms we're backwards (should have been multiplying by the conjugate of orientation to go from world to local), but I wasn't sure. When I'm porting over the updated matlab code I'll be sure to give that another look.

In other news, I finally figured out the right expression to use for computing the Kalman gain (output now matches matlab). That was the last big piece that I need to get the C++ code aligned with matlab:

        //%% 5. Calculate Kalman Gain
        const Eigen::Matrix<double, 1, nsp> &wc = W.wc;
        const Eigen::Matrix<double, X_DIM, nsp> wc_repl = wc.replicate<X_DIM, 1>();
        const Eigen::Matrix<double, X_DIM, O_DIM> Pxy= 
            X_k_r.cwiseProduct(wc_repl).eval() * Y_k_r.transpose();

        // In the Matlab code: KG = (Pxy / Sy_k')/Sy_k
        // where "/" is the "mrdivide" operator, 
        // x = B/A solves the system of linear equations A*x = B for x. 
        // I arrived at the following through trial and error
        Eigen::Matrix<double, O_DIM, X_DIM> numerator = Sy_k.transpose().colPivHouseholderQr().solve(Pxy.transpose());
        Eigen::Matrix<double, X_DIM, O_DIM> KG = Sy_k.colPivHouseholderQr().solve(numerator).transpose();

Hopefully in the process of porting over your latest matlab changes something will pop out at me with the predicted observation code.

EDIT: Fixed magnetometer code link

cboulay commented 8 years ago

BTW, I just noticed that I wasn't handling distance units correctly. Acceleration due to gravity is in m/s^2, but all other distance measures were in cm. I converted everything to m (not pushed yet).

Question for you: If we have a controller's .json file containing information about its magnetometer calibration (Center, BasisX, BasisY, BasisZ, Identity, Error), and we know the controller's orientation quaternion, how do we reproduce the values obtained in CalibratedMag (or even RawMag)?

HipsterSloth commented 8 years ago

The controller's current orientation is the amount you have to rotate the Identity magnetometer direction to align it with the current CalibratedMag reading:

CalibratedMag= Orientation^-1 * [0; MagIdentity] * Orientation

Or stated another way: if the controller is in the same orientation as it was during magnetometer calibration, Orientation should be [1, 0, 0, 0], because the current CalibratedMag reading will align with MagIdentity.

The CalibratedMag value is computed by passing RawMag into eigen_alignment_project_point_on_ellipsoid_basis, which:

1) Subtracts RawMag from the ellipsoid center 2) Projects the re-centered reading onto the ellipsoid basis vectors

NOTE: The basis is currently just the identity matrix * bounding box extents

~~3) Re-normalizes the projects vector onto the surface of the ellipsoid~~

NOTE: I just noticed I don't do this in poll(). I normalize CalibratedMag as part of the orientation filter update. This means CalibratedMag isn't quite a unit vector (it should be close) and you'll want to normalize it in your matlab code.

So if you wanted to go from CalibratedMag back to RawMag you'ld basically just create a transform out of the ellipsoid basis and center and multiply CalibratedMag by that to get RawMag.

cboulay commented 8 years ago

Is the identity orientation actually the one used during calibration (i.e., upright with the trigger facing 'forward'), or is it a rotation from that? In my trainingdata, obtained while the controller was upright, the orientation is typically somewhere around [0.5, 0.5, 0.5, -0.5] (equivalent to -90deg rotation about z followed by a 90 degree rotation about x: quaternion_multiply(axisAngle2Quat([pi/2 0 0]), axisAngle2Quat([0 0 -pi/2]))). I'll have to recollect trainingdata and testdata immediately after calibration to make sure yaw-drift isn't contributing here.

Is the Magnetometer.Identity x,y,z (in the json file) that was measured during calibration saved in world reference frame (x=right, y=up, -z=direction trigger was facing during calibration) or in the controller reference frame? If it's in the controller reference frame, is this the true reference frame during calibration or the transformed reference frame?

I know I can probably find the answers to these questions by digging through the code, but I thought you might know the answers off-hand.

cboulay commented 8 years ago

Do we have an image showing the controller axes? The Wiki says TO DO: IMAGE DEPICTING EACH SENSOR AND THE XYZ ORIENTATION. ALSO SENSOR OFFSETS. Are we using the same axes as Alex here?

In the magnetometer calibration wiki page, there's an image ('waiting for stabilization') that shows the world reference frame (maybe not at its origin, but the direction should be accurate). What is the correspondence between the arrow colours and 'x', 'y', 'z'? These colours are different to the ones used in the screenshot immediately below.

HipsterSloth commented 8 years ago

Sorry for the delay, I was away at lunch then had a meeting.

Is the identity orientation actually the one used during calibration (i.e., upright with the trigger facing 'forward'), or is it a rotation from that?

The identity orientation is the controller pointing forward at the screen. This is defined in init_filters_for_psmove with the call to setCalibrationTransform and setSensorTransform. These transforms only affect the orientation filter. Specifically the PoseFilterSpace class transforms the "Calibrated" sensor data for passing it into the filter.

Is the Magnetometer.Identity x,y,z (in the json file) that was measured during calibration saved in world reference frame (x=right, y=up, -z=direction trigger was facing during calibration) or in the controller reference frame? If it's in the controller reference frame, is this the true reference frame during calibration or the transformed reference frame?

The Magnetometer.Identity x,y,z in the json file should be in the controllers reference frame. It is an average of the calibrated magnetometer readings. It should be in the same space as the accelerometer and the gyroscope.

Do we have an image showing the controller axes?

We don't sadly. Never got around to making that. I don't think that the axes in either of those images is correct (I think my diagram is actually for the axis in Unreal if memory serves). My magnetometer identity (measured with the controller standing vertically) is:

                "X": "-0.347566158",
                "Y": "0.932685435",
                "Z": "0.0964140743"

This implies the the controller reference frame has +Y going out through the bulb. I think +X should be right (out through the start button) and +Z should be out through the PSMove button. This could be verified by running test_controller.exe, shaking the controller, and watch the accelerometer values.

EDIT: Sorry I can't say what the axes are definitively. I figure it out and then forget to write it down :(. I had to go back and look over the code myself.

cboulay commented 8 years ago

I just tested PSMoveService test_controller and psmoveapi test_calibration. The axes directions were the same for both when looking at accelerometer data.

When the controller was horizontal, with the start button to the sky, x = +4000 (test_controller) or +1 (test_calibration).
When the controller was vertical, bulb to the sky, y = +4000 / +1
When the controller was horizontal, Move button to the sky, z = +4000 / +1

For psmoveclient.py

Horizontal, Start to the sky, x=+4000(raw)/+1(calibrated)
Vertical, bulb to the sky, y
Horizontal, Move to the sky, z

So it's the same.

For the magnetometer, in the json file, Magnetometer.Identity = 0.234017432, 0.873125494, 0.42765367 In psmoveclient.py, the calibrated magnetometer reading was roughly similar when standing upright: 0.219, 0.964, 0.491 If I rotate the controller horizontal pointing toward the camera, Move button up, 0.288, -0.311, 0.962 If I rotate it horizontally, pointing toward the camera, Start button up: 1.100, -0.158, -0.171 Horizontal, Move up, pointing away from the camera: -0.420, 0.439, 0.935

cboulay commented 8 years ago

Progress!

I think I understand why I was confused about the orientation. PSMoveState.Pose.Orientation returns the rotation from the null pose where the null pose has the controller horizontal pointed at the camera. All of the calibrated sensor values have the null pose as the controller upright. So when I was using the PSMoveService-provided orientation to check my algebra, I was getting misled.

I now do 3 things to the observation data from PSMoveService before I use it. First I scale position data from cm to m. Second I normalize the magnetometer readings. Third I change the null-rotation, though this only matters for when I'm comparing results.

Now the orientation estimate is sane, though inaccurate, for a few seconds then blows up. The position estimate still has the problem that it matches the optical-estimate position too closely. So maybe all that's needed is to tweak the Q, R, and other parameters.

It also seems as though there are at least as many different ways to incorporate the noise as there are papers. According to the original SRUKF, it can be used to extend the covariance matrix before doing the qr decomposition. The updated SRUKF from the same authors, which the Matlab code is closest to, uses an augmented matrix and add the noise in the process/observation function (not clear if process noise should be added before or after sigma point propagation). The paper by Kraft and the thesis by Eneyati quaternion_multiply in the process noise before the update i.e. q_new = q_old * q_noise * q_delta.

cboulay commented 8 years ago

I didn't finish that last post... I pushed these changes, along with some new training/testing data. The next time I work on this (probably not tomorrow), I'll try different ways of incorporating noise, and also try to change the noise parameters.

HipsterSloth commented 8 years ago

Thanks for all your work on this! I'm in the middle of trying to port this over to C++ now (probably going to take me a few days, hopefully can finish by this weekend).

I think I understand why I was confused about the orientation. PSMoveState.Pose.Orientation returns the rotation from the null pose where the null pose has the controller horizontal pointed at the camera. All of the calibrated sensor values have the null pose as the controller upright. So when I was using the PSMoveService-provided orientation to check my algebra, I was getting misled.

Yeah sorry about that. That's what I was trying to get at with my "PoseFilterSpace" comment above, but it's all pretty confusing. I have to keep looking at the code to keep it straight in my head. What you call "null pose" I have been calling "identity pose". I'm not married to that name though. Do you think null pose would be a more intuitive name to use in the code?

It also seems as though there are at least as many different ways to incorporate the noise as there are papers.

Yeah I think the problem with all the different ways applying orientation noise is that the end result for each method probably looks similar. I'm not sure how you would evaluate which method yields the best result (which methods has the fastest converging S matrix?).

On that note, while looking at how we now apply the angle-axis portion of S to X_t, I noticed something that seemed off here:

% But some of the state variables represent orientations, and cannot be
% simply added or subtracted.
x_qinds = 10:13;
q = filt_struct.x(x_qinds);  % State orientation as quaternion
s_ang = [10 12 14];
S_a = filt_struct.weights.zeta * filt_struct.S(s_ang, s_ang);  % sqrt orientation covariance
S_q = axisAngle2Quat(S_a);
X_t(x_qinds, 1 + s_ang) = quaternion_multiply(q, S_q);
X_t(x_qinds, 1 + filt_struct.Sdim + s_ang) = quaternion_multiply(q, quaternion_conjugate(S_q));

If I set a breakpoint on S_a and inspect it (on the first sample) I get this:

debug> S_a
S_a =

   0.032311   0.000000   0.000000
   0.000000   0.032311   0.000000
   0.000000   0.000000   0.032311

~~Shouldn't S_a have 15 colomns, not 3?~~ Actually, can we do this because this is the only portion of the X_t that will be affected by adding/subtraction zeta*S (since S all zero everywhere except that section)?

cboulay commented 8 years ago

It's true that Q and R will have only zeros in the off-diagonals so I can do as above, but I suppose this isn't true of S after the first iteration; it probably should have 15 columns.

I was plugging my ears and closing my eyes about the possibility that S will have covariances between orientation variables (axisAngles) and other variables (e.g., x-position). I didn't know whether or not I should treat those as orientations that could be converted to quaternions to add to the state. I'm trying that now. I'm also cleaning up the code, putting all the linear variables together so block indexing is easier, making the variables more consistent, and adding a lot of comments.

psmoveservice / PSMoveService

Sensor Fusion between IMU and optical tracker #13