csu-hmc / inverted-pendulum-sys-id-paper

A paper that compares human quiet standing controller parameter identification with various methods.

Bias figure #1

Open moorepants opened 9 years ago

moorepants commented 9 years ago

@tvdbogert

This is an attempt at making a plot of the relative error in the identified gains given reference noise (i.e. unmeasured process noise, or noise in the human's ability to estimate the feedback error) and platform perturbation level. There is no output measurement noise. This confirms the conclusions in the van der Kooij work, but gives an idea of the magnitudes of the platform acceleration needed to directly identify the gains given some level of reference noise.

I've played around with a bunch of ways to plot this, and this seems to work best. So far I'm plotting each gain in a subplot showing a heatmap of the relative error in the gain identification. The cap of the error colors is 0.5 in this case and you'll see "blanks" if the error is above 0.5 (the gain in the bottom left has one that is 233% error). I cap them so I can use the same colorbar scale for each plot. I can also filter the heatmap to average over pixel groups so that some of the spikes are toned down.

Anyways, is this a good way to plot this? I'm going to make the same plot for indirect ID. The gain error in indirect ID should only be affected by the reference noise level and not the platform accel level.

first-attempt-at-direct-id-bias
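For reference, the capped heatmap idea can be sketched as below. All of the data, grid ranges, and names here are made-up stand-ins, not the paper's actual results; the point is only the mechanism: mask any cell whose error exceeds the cap so it renders as a blank, while every subplot shares one colorbar scale.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen
import matplotlib.pyplot as plt

# Hypothetical grid: reference noise STD vs. platform acceleration STD.
ref_noise = np.linspace(0.0, 0.02, 50)       # rad
platform_accel = np.linspace(0.0, 2.0, 50)   # m/s^2
rng = np.random.default_rng(0)
# Stand-in for |identified - known| / |known| at each grid point.
rel_err = rng.random((50, 50)) * 0.1
rel_err[0, 0] = 2.33  # one wild outlier, like the 233% case

cap = 0.5
# Anything above the cap becomes a masked (blank) cell.
masked = np.ma.masked_greater(rel_err, cap)

fig, ax = plt.subplots()
im = ax.pcolormesh(ref_noise, platform_accel, masked,
                   vmin=0.0, vmax=cap, shading="auto")
ax.set_xlabel("reference noise STD [rad]")
ax.set_ylabel("platform acceleration STD [m/s^2]")
fig.colorbar(im, label="relative error in gain")
fig.savefig("bias-heatmap.png")
```

Using the same `vmin`/`vmax` in every subplot is what makes a single shared colorbar honest.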

moorepants commented 9 years ago

What do you think would be a good range for the reference noise? Do you know of any literature that may try to estimate this?

van der Kooij lumps all of this into the reference noise:

Fluctuations in the intention and the interpretation of the task can be modeled with a Gaussian noise (v_sensor) on the reference signal. Also sensory signal noise or other noise sources in the feedback loop can be modeled with this noise v_sensor that cannot be measured directly. In the subsequent paragraphs we will call the noise v sensor the sensor noise although it captures both noise on the reference and within the feedback loop.

I can't think of a way that you'd even design an experiment to estimate that. You'd somehow have to measure the human's internal estimate of the state.

I'm also lumping the reference noise for angles and angular rates together and using the same value. Should I use two reference noise ranges, one for the angles and one for the rates? I can just plot two x axes that give the ref_noise_ang and ref_noise_ang_rate. What would be a good ang_rate range?

tvdbogert commented 9 years ago

I like the way these plots look, but they are noisy and probably time consuming to produce. Especially if you want to have enough data to do some averaging over the noisy results. The system is nearly linear (for small movements) so I think you will find that the bias will only depend on the ratio between reference noise and perturbation. In other words, the color does not vary along a straight line that goes through the origin.

So I think you could plot them just like Samin did. Keep reference noise constant and only vary the perturbation amplitude. Only the ratio matters.

Another reason for doing that is that you can actually find a realistic human value for reference noise and just use that (your last question). Samin's report may not have all that detail, but she looked up some literature about joint motion amplitude during quiet (unperturbed) standing. Then she scaled the amplitude of the reference noise so that the model produced the same amplitude in the same condition.

Yet another reason for varying only the perturbation amplitude is that this may make it feasible to also process a representative number of cases by the indirect approach (which is more time consuming). If your cases are spread over a 2D grid, that could take too long and not really add new information.

About the reference noise. In the end all your sources of reference noise get scaled and added up by the (linear) controller and appear as extra torque at the output of the controller. That final result is all that matters. That's why Samin only injected noise at that point in the system. And then I would call it "controller noise" rather than reference noise. It does not matter where the noise originates, it only matters that there is a part of the controller output that is noisy and not correlated to the state of the plant.

I think there's no reason to have separate noise sources for angles and angular rates. Their combined effect on the controller output is all that matters.

Of course this is only the case for a linear controller. In the nonlinear case, you would need to consider separate sources of noise. But I don't think we even want to use the direct approach anymore, after we demonstrate the bias.
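This equivalence is easy to check numerically in the linear case. A sketch (the gain matrix and noise levels below are illustrative, not the paper's identified values): independent Gaussian reference noises on the angles and rates pass through K and are statistically identical to a single lumped torque noise with covariance K diag(sigma^2) K^T.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical 2x4 gain matrix (rows: torques, cols: angles then rates).
K = np.array([[950., 175., 185., 50.],
              [45., 290., 60., 26.]])
# Separate reference noise STDs for [ankle ang, hip ang, ankle rate, hip rate].
sigma = np.array([0.008, 0.08, 0.016, 0.007])

n = 200000
v = rng.standard_normal((n, 4)) * sigma   # independent reference noises
torque_noise = v @ K.T                    # extra torque at the controller output

# Analytical covariance of the equivalent lumped torque noise.
cov_pred = K @ np.diag(sigma**2) @ K.T
cov_emp = np.cov(torque_noise.T)
print(np.sqrt(np.diag(cov_pred)))  # STDs of the equivalent torque noise
```

So for a linear controller, four separate reference noise knobs collapse into one torque-noise covariance, which is why only the combined effect is identifiable.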

moorepants commented 9 years ago

Thanks for the feedback.

I like the way these plots look, but they are noisy

The noise is easily dealt with using a filter, for example:

first-attempt-at-direct-id-bias-filtered-5-constant
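The smoothing here is just a moving average over neighboring pixels; a minimal sketch (window size and the reflect-padding choice are knobs, not derived values):

```python
import numpy as np

def smooth2d(grid, size=5):
    """Mean over a size x size pixel neighborhood, edges padded by reflection."""
    pad = size // 2
    padded = np.pad(grid, pad, mode="reflect")
    out = np.zeros_like(grid, dtype=float)
    # Sum the size*size shifted copies, then divide to get the mean.
    for di in range(size):
        for dj in range(size):
            out += padded[di:di + grid.shape[0], dj:dj + grid.shape[1]]
    return out / size**2

rng = np.random.default_rng(2)
noisy = rng.random((50, 50))
smoothed = smooth2d(noisy, size=5)
```

`scipy.ndimage.uniform_filter` does the same thing in one call if SciPy is available.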

and probably time consuming to produce. Especially if you want to have enough data to do some averaging over the noisy results.

It's not so bad. My simulation runs in 1.5 s. This plot has 2500 data points, so it takes about an hour to produce the data. The direct ID is fast. With direct collocation, each run takes maybe another 5 seconds at most, so that plot will take an additional ~3.5 hrs.
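The back-of-envelope timing works out as:

```python
trials = 2500
sim_s = 1.5       # seconds per simulation
colloc_s = 5.0    # seconds per direct collocation run (upper bound)

data_hours = trials * sim_s / 3600
colloc_hours = trials * colloc_s / 3600
print(round(data_hours, 2), round(colloc_hours, 2))
```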

The system is nearly linear (for small movements) so I think you will find that the bias will only depend on the ratio between reference noise and perturbation. In other words, the color does not vary along a straight line that goes through the origin.

The higher accelerations are not necessarily small movements. These all come from simulating the non-linear system.

So I think you could plot them just like Samin did. Keep reference noise constant and only vary the perturbation amplitude. Only the ratio matters.

Yeah, I forgot about the fact that the ratio is all that matters. Here is that plot for the same data you see in the plots above.

first-relative-error-vs-noise-ratio
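Construction of this ratio plot can be sketched as below; all data and variable names are synthetic stand-ins for the grid results, and the log x-axis is my own choice since the ratio spans orders of magnitude.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt

rng = np.random.default_rng(3)
# Hypothetical per-trial values from the 2D grid of conditions.
ref_noise = rng.uniform(0.001, 0.02, 2500)     # rad
platform_accel = rng.uniform(0.0, 2.0, 2500)   # m/s^2
# Stand-in error trend: improves as the perturbation dominates the noise.
rel_err = 0.01 + 0.5 / (1.0 + platform_accel / ref_noise)

ratio = platform_accel / ref_noise
order = np.argsort(ratio)

fig, ax = plt.subplots()
ax.plot(ratio[order], rel_err[order] * 100, ".", markersize=2)
ax.set_xscale("log")
ax.set_xlabel("platform accel STD / reference noise STD")
ax.set_ylabel("relative error in gain [%]")
fig.savefig("error-vs-ratio.png")
```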

Same kinda curve as Samin's, but the relative error is never much greater than 1% except for the zero acceleration cases. That makes me think that it may not matter how much you perturb the model, you'll still get good estimates. All that matters is that you perturb it. Can we assume that for the gait direct id?

Another reason for doing that is that you can actually find a realistic human value for reference noise and just use that (your last question). Samin's report may not have all that detail, but she looked up some literature about joint motion amplitude during quiet (unperturbed) standing. Then she scaled the amplitude of the reference noise so that the model produced the same amplitude in the same condition.

I'll look for the numbers she used. I've questioned this before, but I guess that now makes sense.

Yet another reason for varying only the perturbation amplitude is that this may make it feasible to also process a representative number of cases by the indirect approach (which is more time consuming). If your cases are spread over a 2D grid, that could take too long and not really add new information.

Good point.

About the reference noise. In the end all your sources of reference noise get scaled and added up by the (linear controller) and appear as extra torque at the output of the controller. That final result is all that matters. That's why Samin only injected noise at that point in the system. And then I would call it "controller noise" rather than reference noise. It does not matter where the noise originates, it only matters that there is a part of the controller output that is noisy and not correlated to the state of the plant.

I've been unclear about the noise source descriptions from van der Kooij and how the noise sources interact for over a year I suppose. I may finally get it. I guess this feels like going in circles for you. Sorry.

moorepants commented 9 years ago

I dug around in Samin's code and found this for computing the noise she adds to the torques:

c     = 0.5;                    % coefficient of dW2 and dW3 (controller noise)
Delta = 2^(-11);                % Euler-Maruyama timestep
delta = Delta;                  % Brownian path has timestep delta
dW2    = c*randn(M,1)./sqrt(delta);
dW3    = c*randn(M,1)./sqrt(delta);

This gives a range of about -100 to 100 for the dW2 and dW3 values:

In [11]: 0.5 * np.random.randn(1e6) / sqrt(2.0**(-11))
Out[11]: 
array([ 21.18715423, -31.9240134 , -19.30520458, ...,  19.28008861,
         3.77578202, -40.19964438])

In [12]: a = _

In [13]: a.shape
Out[13]: (1000000,)

In [14]: a.mean()
Out[14]: -0.013988174997315502

In [15]: a.std()
Out[15]: 22.607616627819166

In [16]: a.max()
Out[16]: 103.77141390493054

In [17]: a.min()
Out[17]: -106.89393635272903

In her report she cites the Park paper about picking numbers that are realistic for unperturbed movement, but I can't find anything in the Park paper that gives numbers for unperturbed movement. (They do say they measured it, but don't really report on that.)

I'm adding noise at the reference, so to get an estimate of equivalent reference noise values based on her numbers (using my state ordering in the K matrix):

In [39]: K
Out[39]: 
array([[ 950.,  175.,  185.,   50.],
       [  45.,  290.,   60.,   26.]])

In [40]: np.linalg.lstsq(K, np.array([25.0, 25.0]))
Out[40]: 
(array([ 0.00789743,  0.08103297,  0.01601141,  0.00709115]),
 array([], dtype=float64),
 2,
 array([ 991.17221306,  278.97785585]))

So that is the approximate standard deviation needed for the states [ankle angle, hip angle, ankle rate, hip rate]: [0.00789743, 0.08103297, 0.01601141, 0.00709115], to give the same std as her total torque noise. Maybe I'll just go with 0.02 rad and 0.02 rad/s (about 1.1 deg), which is about the maximum I'm using in the graphs above. It seems like an std of over 1 deg in angle estimation error is quite high. If we were consistently off by that much, we'd be hard pressed to do anything correctly. It isn't clear why Samin chose these values.
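A quick sanity check of that least-squares step: for the underdetermined system K sigma = [25, 25], `lstsq` returns the minimum-norm solution, which should reproduce the transcript values exactly since K has full row rank. The rad-to-deg conversion is included for the 0.02 rad figure.

```python
import numpy as np

K = np.array([[950., 175., 185., 50.],
              [45., 290., 60., 26.]])
# Target torque noise STD from Samin's c / sqrt(delta) scaling (~25 N*m).
target = np.array([25.0, 25.0])

# Minimum-norm solution of the underdetermined system K @ sigma = target.
sigma, *_ = np.linalg.lstsq(K, target, rcond=None)
print(sigma)             # per-state reference noise STDs
print(K @ sigma)         # reproduces [25, 25] exactly
print(np.rad2deg(0.02))  # 0.02 rad expressed in degrees
```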

moorepants commented 9 years ago

I spent too many hours yesterday cleaning up my code for this so I can run all the indirect IDs too, and I'm now not sure if I introduced a bug. So now I ran ~10000 trials with different platform acceleration and reference noise, just like above (I didn't fix the reference noise). It takes 1 hour to generate 10000 trials on my machine with 4 cores. Here is the heatmap plot (now showing 0 to 100% relative error, and any trial above 100% is green): 10000-trials-direct-id-heatmap

So this looks similar to above and gives the expected results. For any given reference noise, increasing the platform acceleration improves the identification.

Now I make the same 2D line plot as above that plots all of these values against the ratio of platform acceleration to reference noise, and I don't get that nice looking result anymore: 10000-trials-direct-id-vs-ratio

If I filter the data (2D moving average of the 5 nearest neighbors), then it seems like there may be some indication of the nice lines that appeared above: 10000-trials-direct-id-vs-ratio-filtered

So I'm not sure if I now have a bug. I changed the data generation code to speed it up and I changed the plotting code to modularize it. But if the above graph is correct, then it seems like it isn't only the ratio that matters. I didn't write unit tests for all this stuff, so I don't have an easy way to find the difference in the outputs. I'm going to stop wasting time on this now and get all the indirect IDs running.

moorepants commented 9 years ago

Ok, I've now created the same two graphs with the results from indirect identification with direct collocation. It takes about 2 hrs and 20 mins to run this computation for 10200 trials. I use the known gains and the measured state trajectories as an initial guess, because all we are worried about is what it converges to, not whether it can get there from arbitrary initial guesses. There is no noise added to any measurements, in this case the states and the platform accel.

I assumed that we'd see the identification get worse as reference noise increases: our model does not have a process noise term, so there is error embedded in the state measurements that comes from that unmodeled process. Secondly, I would expect the error in the gains to stay constant with respect to the platform acceleration magnitude, because we have modeled the feedback loop in the identification model. But... 10000-trials-indirect-id-heatmap

We don't see that. In fact, this is a similar result to the direct ID. The only major difference is that the identified gains are never at more than 100% error (although the ~0 acceleration cases may be; I haven't yet tagged the IDs where Ipopt times out, which are some of the ~0 accel cases).

The (filtered) ratio graph looks like: 10000-trials-indirect-id-vs-ratio-filtered

So I'm confused as to why this result isn't different from the direct ID result. Or maybe I'm confused about what I'm trying to accomplish here. I've been under the impression that indirect ID of a closed loop system's controller would not be affected by the amount of perturbation applied.

@tvdbogert Can you comment with your wisdom? Thanks.

tvdbogert commented 9 years ago

(this was in my Drafts folder since Feb 13, it was never sent -- some of this may be moot now -- will go over your more recent things tomorrow)

It would be important to know why your current code gives results that are different, if you run the same protocol and plot the results the same way.

Here some other ideas:

  1. Samin averaged 500 trials and even then there was considerable noise on the average. But there was a clear trend as a function of the ratio and it leveled off at a non-zero bias. Averaging 5 trials may not be enough.
  2. For larger deviations from vertical, your system is not linear and it's not just the ratio that matters.
  3. For very large perturbations (which are guaranteed to happen sometimes if you have Gaussian acceleration and given enough time) the system may not even be stable. To guarantee stability, the proportional gain would have to be large enough to raise the body up from a completely horizontal posture.
  4. Scatter plots with huge numbers of points will always draw your attention to the outliers. Results may be a lot better than they look. It's hard to (visually) integrate all the data. Smaller dots would help, but there are so many points. Can you convert the plots with ratio on the horizontal axis to a heat map?
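That scatter-to-heat-map conversion can be sketched with a 2D histogram, where each bin's color is the count of points falling in it (all data below is synthetic stand-in, and the bin count is an arbitrary choice):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt

rng = np.random.default_rng(4)
# Synthetic scatter: log10 of the accel/noise ratio vs. relative error.
log_ratio = rng.uniform(0, 3, 10000)
rel_err = 0.5 * 10**(-log_ratio) + 0.01 * rng.random(10000)

# Bin the scatter into a 2D density grid.
counts, xedges, yedges = np.histogram2d(log_ratio, rel_err, bins=40)

fig, ax = plt.subplots()
# Transpose so x varies along the horizontal axis of the image.
ax.pcolormesh(xedges, yedges, counts.T)
ax.set_xlabel("log10(platform accel STD / reference noise STD)")
ax.set_ylabel("relative error in gain")
fig.savefig("ratio-heatmap.png")
```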

You could even consider simulating a linearized approximation of the pendulum to eliminate issues 2 and 3.
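A sketch of that linearization idea for a single-link stand-in (all parameters and gains below are hypothetical, and this is a one-link simplification of the double pendulum): linearize about upright and simulate the closed loop with Euler steps, so the nonlinearity and instability issues (2 and 3) disappear by construction as long as the proportional gain exceeds the gravitational stiffness.

```python
import numpy as np

# Hypothetical single-link inverted pendulum, linearized about upright:
# I * theta_ddot = m*g*l*theta - torque  (small-angle approximation).
m, g, l = 80.0, 9.81, 1.0      # mass [kg], gravity [m/s^2], COM height [m]
I = m * l**2                   # point-mass inertia about the ankle
kp, kd = 2000.0, 400.0         # stabilizing PD gains (kp > m*g*l)

dt, n = 0.001, 20000
rng = np.random.default_rng(5)
x = np.zeros(2)                # state: [theta, theta_dot]
thetas = np.empty(n)
for i in range(n):
    noise = 5.0 * rng.standard_normal()       # lumped controller torque noise
    torque = kp * x[0] + kd * x[1] + noise    # linear full-state feedback
    theta_ddot = (m * g * l * x[0] - torque) / I
    x = x + dt * np.array([x[1], theta_ddot])  # Euler step
    thetas[i] = x[0]
```

Since the closed loop is linear and stable, the response scales exactly with the noise and perturbation amplitudes, so only their ratio can matter here.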

