Optimize tracking loop parameters

bgottula commented 7 years ago

The default loop parameters are pretty conservative and result in a very slow response rate. Experiment with different parameters to see if a stable loop with a wider bandwidth is possible.

bgottula commented 7 years ago

I experimented a little with the loop parameters in blind tracking mode. Increasing the loop bandwidth to 1.0 Hz or above caused the loop to oscillate, even when the damping factor was increased to 10 or higher. The highest bandwidth I was able to use without obvious oscillations was 0.7 Hz.

I think the next step is to measure the step response of the system and compare it to the ideal step response curve. If they are drastically different, it probably means that the linear system model of the PLL that was used to derive the loop filter coefficients is not appropriate. At that point either the model must be modified to match the actual system behavior, or the loop filter's proportional and integrator gains must be optimized empirically.
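
For reference, the ideal curve can be generated by simulating the linear model directly. The sketch below assumes an ideal mount that integrates the commanded rate with no latency or acceleration limit, a proportional-plus-integral loop filter, and the standard PLL noise-bandwidth relation `B_L = (wn / 2) * (zeta + 1 / (4 * zeta))` for mapping bandwidth to gains. The function names are hypothetical, and the mapping may differ from what the software actually uses.

```python
def loop_gains(bandwidth_hz, damping):
    """Map loop bandwidth and damping factor to PI gains.

    Assumes the standard second-order PLL noise-bandwidth relation;
    this is an assumption, not taken from the tracking software.
    """
    natural_freq = 2.0 * bandwidth_hz / (damping + 1.0 / (4.0 * damping))
    kp = 2.0 * damping * natural_freq  # proportional gain
    ki = natural_freq ** 2             # integrator gain
    return kp, ki


def ideal_step_response(bandwidth_hz, damping, duration_s=20.0, dt=0.001):
    """Simulate a unit step in target position with an ideal mount."""
    kp, ki = loop_gains(bandwidth_hz, damping)
    position = 0.0
    integrator = 0.0
    times, positions = [], []
    for n in range(int(duration_s / dt)):
        error = 1.0 - position
        integrator += ki * error * dt
        rate = kp * error + integrator  # commanded slew rate
        position += rate * dt           # ideal mount: rate applies instantly
        times.append((n + 1) * dt)
        positions.append(position)
    return times, positions


times, positions = ideal_step_response(bandwidth_hz=0.5, damping=0.5)
print("peak position:", max(positions))
print("final position:", positions[-1])
```

Plotting the measured mount position against `positions` for the same bandwidth and damping factor would show how far the real system departs from the model.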

bgottula commented 6 years ago

Most of the problems with the control loop response were probably due to the slow acceleration of the NexStar 130SLT mount's motors and its long command latency. I measured the step response of the Losmandy G11 mount and found that it matched theory almost perfectly for a damping factor of 0.5. The G11 servo motors are much more powerful and can accelerate very quickly, and the Gemini 2 computer has very low command latency.

This task is still valid because I have not attempted to optimize the loop bandwidth or damping factor for the new mount. The goal should be to select parameters that minimize the time to convergence within some target error bound without causing loop instability or other pathological behavior.

bgottula commented 6 years ago

Inspection of telemetry from tracking on 21 May indicates that the output of the optical error source is a bit noisy, and this noise resulted in some jerkiness in the tracking performance which was visible in the slew rate telemetry. There may be ways to reduce noise from the optical error source (which would be the ideal solution) but I think it should also be possible to reduce the impact of this noise on tracking performance by reducing the loop bandwidth. If this approach is taken some experiments should be performed to ensure that a smaller loop bandwidth still allows the mount to "keep up" with a target that is accelerating quickly in one or both axes, for example a satellite that passes somewhat near the celestial pole.

Another option would be to apply a low-pass filter to the error source output. I am skeptical of this approach because low-pass filters inherently introduce group delay (latency) which may cause the loop to be less responsive and possibly less stable. I'm not sure how this trades against the reduced responsiveness that would result from decreasing the loop bandwidth.
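
To put a number on the delay: for a target moving at a constant rate, a filter's output lags the input by its low-frequency group delay. The sketch below measures that lag for two common low-pass filters, an N-tap moving average and a single-pole exponential smoother. Neither is necessarily the filter that would be used here, and the function names are illustrative.

```python
def moving_average_lag(num_taps, num_samples=200):
    """Steady-state lag (in samples) of an N-tap moving average on a ramp."""
    window = []
    output = 0.0
    for n in range(num_samples):
        window.append(float(n))
        if len(window) > num_taps:
            window.pop(0)
        output = sum(window) / len(window)
    return (num_samples - 1) - output


def exponential_smoother_lag(alpha, num_samples=2000):
    """Steady-state lag (in samples) of y[n] = alpha*x[n] + (1-alpha)*y[n-1]."""
    output = 0.0
    for n in range(num_samples):
        output = alpha * n + (1.0 - alpha) * output
    return (num_samples - 1) - output


print(moving_average_lag(5))          # (N - 1) / 2 samples
print(exponential_smoother_lag(0.2))  # (1 - alpha) / alpha samples
```

The lag in seconds is the lag in samples times the error source's sample period. For example, a 5-tap average on a hypothetical 10 Hz error source would add about 0.2 s of delay inside the loop.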

bgottula commented 5 years ago

During the two August 2018 observing sessions the mount lagged badly behind some targets, so I am marking this issue as a blocker. The tracking loop is second order, so in theory it should maintain zero steady-state position error for a target moving at constant velocity. Perhaps the loop with the current settings can't handle the acceleration. If that's the case we could either increase the loop bandwidth (which might exacerbate tracking jitter) or try a 3rd order loop. I don't have any experience with 3rd order loops, so I'm hesitant to go there (2nd order loops are tricky enough), but if we have to we have to.
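
The theory can be checked with a quick simulation of the idealized loop, treating the mount as a perfect integrator of the commanded rate and using a proportional-plus-integral loop filter. The gains and the function name are illustrative only. Under that model a constant-velocity target is tracked with zero steady-state error, while a constant acceleration `a` leaves a residual error of `a / ki`, where `ki` is the integrator gain.

```python
def steady_state_error(kp, ki, velocity=0.0, acceleration=0.0,
                       duration_s=60.0, dt=0.001):
    """Return the tracking error after duration_s for an ideal mount."""
    position = 0.0
    integrator = 0.0
    error = 0.0
    for n in range(int(duration_s / dt)):
        t = n * dt
        target = velocity * t + 0.5 * acceleration * t * t
        error = target - position
        integrator += ki * error * dt
        position += (kp * error + integrator) * dt
    return error


# Illustrative gains; not the values used by the tracking software.
print(steady_state_error(1.0, 1.0, velocity=0.5))      # approaches 0
print(steady_state_error(1.0, 1.0, acceleration=0.1))  # approaches 0.1 / ki
```

Since `ki` grows with the square of the loop's natural frequency, widening the bandwidth shrinks this acceleration lag quickly, at the cost of passing more error-source noise.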

Another possibility is that the latency of the Logitech webcam is high enough that the software is tracking the position of the object at some time in the past. I don't think this is the case because I'm pretty sure I could see the object drifting off center in the webcam view. It should be possible to confirm this by looking at optical error telemetry from recent observing sessions.

bgottula commented 5 years ago

Additional thoughts:

- The integrator of the loop filter is currently clamped any time the slew() method on the mount reports that a limit was enforced. I think this may be overly aggressive and may be interfering with proper loop behavior. A limit on the integrator is still prudent, but perhaps we should instead clamp it to the max supported slew rate every time, and not to whatever slew rate the slew() method returns, which could be much lower than the max due to acceleration limits.
- The default acceleration limits on the G11 mount may be too conservative. They may have been set to conservative values when I thought our software was responsible for frequent motor stalls, but I'm not sure this was ever actually the case.
- It's unlikely but possible that machine precision is preventing the loop filter integrator from accumulating properly.
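
The first point could look something like the sketch below: the integrator is clamped to the mount's maximum slew rate on every update, regardless of what rate the mount actually achieved. The class and attribute names are hypothetical and are not taken from the existing code.

```python
class ClampedLoopFilter:
    """PI loop filter whose integrator is always clamped to +/- max_rate."""

    def __init__(self, kp, ki, max_rate):
        self.kp = kp
        self.ki = ki
        self.max_rate = max_rate  # max slew rate supported by the mount
        self.integrator = 0.0

    def _clamp(self, value):
        return max(-self.max_rate, min(self.max_rate, value))

    def update(self, error, dt):
        """Return the commanded slew rate for the current error sample."""
        self.integrator = self._clamp(self.integrator + self.ki * error * dt)
        return self._clamp(self.kp * error + self.integrator)


loop_filter = ClampedLoopFilter(kp=1.0, ki=1.0, max_rate=4.0)
for _ in range(1000):
    rate = loop_filter.update(error=10.0, dt=0.1)
print(loop_filter.integrator, rate)  # both saturate at max_rate
```

Because the clamp does not depend on the rate the mount reports back, the integrator can keep accumulating while the mount is still accelerating toward the commanded rate.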

bgottula commented 5 years ago

During blind tracking testing with only the declination axis today I noted that for many scenarios the mount will slew quickly towards the target, overshoot, and then return in the opposite direction, but then things grind to a near halt and it takes an inordinate amount of time for it to eliminate the last little bit of error. This pretty clearly rules out anything specific to either error source since I was watching printouts of the error term and could plainly see that it was painfully non-zero for an extended period.

bgottula commented 5 years ago

I'm also thinking that the current approach to setting the proportional and integral gains in the loop filter may not be appropriate, since the math is based on the assumption that a commanded slew rate takes effect instantaneously. This is probably why the mount oscillates when the loop bandwidth is set even moderately high. The actual response characteristics of the mount (max acceleration and latency, for example) may need to be modeled properly.
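
A rough way to explore this is to add command latency and an acceleration limit to a simulated mount and see how the step response changes. The sketch below is an assumption-laden toy model: the gains, the 0.2 s latency, and the function name are made up for illustration and are not measurements of either mount.

```python
from collections import deque


def simulate_step(kp, ki, latency_s=0.0, max_accel=float("inf"),
                  duration_s=30.0, dt=0.001):
    """Unit step response with delayed commands and limited acceleration."""
    delay_steps = int(round(latency_s / dt))
    pending = deque([0.0] * delay_steps)  # rate commands still in flight
    position = 0.0
    rate = 0.0
    integrator = 0.0
    positions = []
    for _ in range(int(duration_s / dt)):
        error = 1.0 - position
        integrator += ki * error * dt
        pending.append(kp * error + integrator)
        commanded = pending.popleft()  # command issued latency_s ago
        max_change = max_accel * dt
        rate += max(-max_change, min(max_change, commanded - rate))
        position += rate * dt
        positions.append(position)
    return positions


ideal = simulate_step(kp=2.0, ki=4.0)
delayed = simulate_step(kp=2.0, ki=4.0, latency_s=0.2)
print("ideal peak:", max(ideal))
print("peak with 0.2 s latency:", max(delayed))
```

With these illustrative numbers the delayed loop overshoots more than the ideal one, even though the gains are identical. Passing a finite `max_accel` allows the same comparison for the acceleration limit.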

Alternatively, we could find optimal loop filter coefficients by some kind of optimization algorithm that runs tests with the actual hardware. The cost function would be the elapsed time until the error magnitude falls below some threshold and stays below it for some duration. This process could be time consuming but should only need to be run once for a given hardware configuration and could be automated to a large extent.
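
A minimal sketch of that cost function, assuming the test harness records timestamped error samples. The function name and arguments are hypothetical.

```python
def settling_time(times, errors, threshold, hold_s):
    """Time at which |error| drops below threshold and stays there for hold_s.

    Returns float("inf") if the error never settles within the record.
    """
    entered = None  # time the error most recently went below threshold
    for t, e in zip(times, errors):
        if abs(e) < threshold:
            if entered is None:
                entered = t
            if t - entered >= hold_s:
                return entered
        else:
            entered = None  # left the band; start over
    return float("inf")


times = [0, 1, 2, 3, 4, 5, 6]
errors = [5.0, 0.5, 2.0, 0.4, 0.3, 0.2, 0.1]
print(settling_time(times, errors, threshold=1.0, hold_s=2))  # 3
```

Returning infinity for runs that never settle lets an optimizer treat unstable or oscillating parameter choices as strictly worse than any run that converges.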