Closed bgottula closed 4 years ago
I did a little bit of experimentation with the loop parameters in blind tracking mode. I found that increasing the loop bandwidth to 1.0 Hz or above caused the loop to oscillate, even when the damping factor was increased to 10 or higher. The highest bandwidth I was able to use without obvious oscillations was 0.7 Hz.
I think the next step is to measure the step response of the system and compare it to the ideal step response curve. If they are drastically different, it probably means that the linear system model of the PLL that was used to derive the loop filter coefficients is not appropriate. At that point either the model must be modified to match the actual system behavior, or the loop filter's proportional and integral gains must be optimized by empirical methods.
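As a rough reference for that comparison, the ideal curve can be generated numerically. This is only a sketch under stated assumptions, not the project's code: it assumes the mount is a perfect integrator of the commanded slew rate (the rate takes effect instantly), that the loop filter is a PI controller with proportional gain 2ζωn and integral gain ωn², and that "loop bandwidth" means the natural frequency in Hz. The function and variable names are hypothetical.

```python
import math

def ideal_step_response(bandwidth_hz, damping, duration_s=10.0, dt=0.001):
    """Simulate the position error after a unit step for an idealized loop.

    Assumptions (not taken from the project code): the mount is a pure
    integrator of commanded slew rate, and bandwidth_hz is the natural
    frequency of the closed loop in Hz.
    """
    omega_n = 2.0 * math.pi * bandwidth_hz
    kp = 2.0 * damping * omega_n   # proportional gain
    ki = omega_n ** 2              # integral gain
    position = 0.0
    integrator = 0.0
    times, errors = [], []
    for step in range(int(duration_s / dt)):
        error = 1.0 - position           # unit step in target position
        times.append(step * dt)
        errors.append(error)
        integrator += ki * error * dt
        rate = kp * error + integrator   # commanded slew rate
        position += rate * dt            # ideal mount: rate applies instantly
    return times, errors

times, errors = ideal_step_response(bandwidth_hz=0.5, damping=0.5)
overshoot = -min(errors)   # how far the position swings past the target
```

Measured error telemetry from a real step could be plotted against `errors` to see how far the hardware departs from this idealized model.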
Most of the problems with the control loop response were probably due to the slow acceleration rate of the NexStar 130SLT mount's motors and the long command latency. I measured the step response of the Losmandy G11 mount and found that it matched theory nearly perfectly for a 0.5 damping factor. The G11 servo motors are much more powerful and can accelerate very quickly, and the Gemini 2 computer has very low command latency.
This task is still valid because I have not attempted to optimize the loop bandwidth or damping factor for the new mount. The goal should be to select parameters that minimize the time to convergence within some target error bound without causing loop instability or other pathological behavior.
Inspection of telemetry from tracking on 21 May indicates that the output of the optical error source is a bit noisy, and this noise resulted in some jerkiness in the tracking performance which was visible in the slew rate telemetry. There may be ways to reduce noise from the optical error source (which would be the ideal solution) but I think it should also be possible to reduce the impact of this noise on tracking performance by reducing the loop bandwidth. If this approach is taken some experiments should be performed to ensure that a smaller loop bandwidth still allows the mount to "keep up" with a target that is accelerating quickly in one or both axes, for example a satellite that passes somewhat near the celestial pole.
Another option would be to apply a low-pass filter to the error source output. I am skeptical of this approach because low-pass filters inherently introduce group delay (latency) which may cause the loop to be less responsive and possibly less stable. I'm not sure how this trades against the reduced responsiveness that would result from decreasing the loop bandwidth.
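To put a number on the group delay concern, here is a small sketch using an N-tap moving average, which is just one possible low-pass filter and is chosen here only because its delay is easy to state: (N - 1) / 2 samples at every frequency. The class and function names are hypothetical, and the 20 Hz sample rate is an illustrative assumption.

```python
from collections import deque

class MovingAverage:
    """N-tap moving average, one simple example of a low-pass filter."""

    def __init__(self, num_taps):
        self.window = deque(maxlen=num_taps)

    def update(self, sample):
        self.window.append(sample)
        return sum(self.window) / len(self.window)

def group_delay_seconds(num_taps, sample_rate_hz):
    """Group delay of an N-tap moving average: (N - 1) / 2 samples."""
    return (num_taps - 1) / 2.0 / sample_rate_hz

# Feed in a ramp (one unit per sample). Once the window is full, the
# output trails the input by exactly the group delay.
filt = MovingAverage(num_taps=5)
outputs = [filt.update(float(n)) for n in range(20)]
lag_samples = 19.0 - outputs[-1]
delay_s = group_delay_seconds(num_taps=5, sample_rate_hz=20.0)
```

With these example numbers the filtered error lags the raw error by 2 samples, or 0.1 s at 20 Hz, which is the extra latency the loop would have to tolerate.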
During the two August 2018 observing sessions the mount lagged behind some targets pretty badly, so I am marking this issue as a blocker. The tracking loop is second order, so in theory there should be no problem maintaining zero steady-state position error for constant velocity. Perhaps the loop with the current settings can't handle the acceleration. If that's the case we could either increase the loop bandwidth (which might exacerbate tracking jitter) or try a 3rd order loop. I don't have any experience with 3rd order loops so I'm hesitant to go there--2nd order loops are tricky enough--but if we have to we have to.
Another possibility is that the latency of the Logitech webcam is high enough that the software is tracking the position of the object at some time in the past. I don't think this is the case because I'm pretty sure I could see the object drifting off center in the webcam view. It should be possible to confirm this by looking at optical error telemetry from recent observing sessions.
Additional thoughts:
[...] slew() method on the mount reports that a limit was enforced. I think this may be overly aggressive and may be interfering with proper loop behavior. A limit on the integrator is still prudent, but perhaps we should instead clamp it to the max supported slew rate every time and not to whatever slew rate the slew() method returns, which could be much lower than the max due to acceleration limits.
During blind tracking testing with only the declination axis today I noted that for many scenarios the mount will slew quickly towards the target, overshoot, and then return in the opposite direction, but then things grind to a near halt and it takes an inordinate amount of time for it to eliminate the last little bit of error. This pretty clearly rules out anything specific to either error source since I was watching printouts of the error term and could plainly see that it was painfully non-zero for an extended period.
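The integrator clamp suggested above (always limiting to the mount's maximum supported slew rate rather than to whatever rate was last achieved) could look roughly like the following. This is a sketch, not the existing controller: the class name, gains, and the 4.0 deg/s limit are hypothetical values chosen for illustration.

```python
def clamp(value, limit):
    """Limit value to the range [-limit, +limit]."""
    return max(-limit, min(limit, value))

class ClampedPI:
    """PI controller whose integrator is clamped to a fixed rate limit."""

    def __init__(self, kp, ki, max_slew_rate):
        self.kp = kp
        self.ki = ki
        self.max_slew_rate = max_slew_rate  # hypothetical mount limit, deg/s
        self.integrator = 0.0

    def update(self, error, dt):
        # Clamp the integrator to the maximum supported rate on every cycle,
        # independent of the rate the mount actually reached last cycle.
        self.integrator = clamp(
            self.integrator + self.ki * error * dt, self.max_slew_rate
        )
        # The commanded rate is limited to the same bound.
        return clamp(self.kp * error + self.integrator, self.max_slew_rate)

controller = ClampedPI(kp=1.0, ki=40.0, max_slew_rate=4.0)
rate = controller.update(error=10.0, dt=0.05)
```

Because the bound is fixed, a temporary acceleration limit on the mount no longer shrinks the integrator, while windup beyond what the mount can ever deliver is still prevented.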
I'm also thinking that the current approach to setting the integral and proportional gain terms in the loop filter may not be appropriate, since the math is based on the assumption that the slew rate takes effect instantaneously. This is probably why the mount oscillates when the loop bandwidth is set even moderately high. The actual response characteristics of the mount (max acceleration and latency, for example) may need to be modeled properly.
Alternatively, we could find optimal loop filter coefficients by some kind of optimization algorithm that runs tests with the actual hardware. The cost function would be the elapsed time until the error magnitude falls below some threshold and stays below it for some duration. This process could be time consuming but should only need to be run once for a given hardware configuration and could be automated to a large extent.
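The cost function described above could be computed from logged error telemetry along these lines. This is a minimal sketch; the function names and the choice to return infinity for runs that never settle are assumptions, not existing project code.

```python
import math

def settling_time(times, errors, threshold, hold_s):
    """Return the time at which |error| drops below threshold and then
    stays below it for at least hold_s seconds, or None if it never does."""
    entered = None
    for t, e in zip(times, errors):
        if abs(e) < threshold:
            if entered is None:
                entered = t          # start of a candidate settled interval
            if t - entered >= hold_s:
                return entered       # stayed inside the bound long enough
        else:
            entered = None           # left the bound, start over
    return None

def cost(times, errors, threshold, hold_s):
    """Cost for an optimizer: settling time, or infinity if never settled."""
    result = settling_time(times, errors, threshold, hold_s)
    return math.inf if result is None else result

# Example: the error dips below 1.0 at t=1, leaves again at t=2, and
# settles for good starting at t=3.
example_times = [0, 1, 2, 3, 4, 5, 6]
example_errors = [5.0, 0.5, 2.0, 0.4, 0.3, 0.2, 0.1]
settled_at = settling_time(example_times, example_errors, 1.0, 2.0)
```

An automated tuning run would execute one hardware test per candidate set of coefficients and hand `cost(...)` to whatever optimizer is chosen.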
I have learned and done a few things since last commenting on this issue.
First, I derived the expected steady-state tracking error for an accelerating target and found that it is far too large at reasonable target accelerations with the loop filter integral term that has been in use for the last couple of years. This almost certainly explains some of the issues we have observed where the mount appears to lag behind the target despite no apparent issues calculating the position error accurately. Thus we need to use a larger integral term while maintaining loop stability.
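For reference, the textbook result for a PI controller driving an ideal integrator plant is that the steady-state error under constant target acceleration a equals a / Ki, where Ki is the integral gain. The sketch below checks that relationship numerically. It assumes an idealized mount with no acceleration limit or latency, and the names and example values are hypothetical; the derivation actually used for this issue may differ in its details.

```python
import math

def steady_state_accel_error(accel, ki):
    """Textbook steady-state error for a PI loop with an integrator plant."""
    return accel / ki

def simulate_accel_tracking(accel, kp, ki, duration_s=20.0, dt=0.001):
    """Track a constantly accelerating target and return the final error.

    Assumes an ideal mount: the commanded rate takes effect instantly.
    """
    position = 0.0
    integrator = 0.0
    error = 0.0
    for step in range(int(duration_s / dt)):
        t = step * dt
        target = 0.5 * accel * t * t     # constant-acceleration trajectory
        error = target - position
        integrator += ki * error * dt
        position += (kp * error + integrator) * dt
    return error

# Example values: target accelerating at 0.2 deg/s^2, integral gain 20 s^-2,
# proportional gain chosen for a damping factor of sqrt(2)/2.
ki = 20.0
kp = 2.0 * (math.sqrt(2.0) / 2.0) * math.sqrt(ki)
predicted = steady_state_accel_error(0.2, ki)         # 0.01 deg
simulated = simulate_accel_tracking(0.2, kp, ki)
```

Under this model, halving the steady-state lag requires doubling the integral gain, which is why a larger integral term is needed for accelerating targets.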
Second, I confirmed in simulation that the acceleration limits applied to the mount in software are a major reason the control system becomes unstable when the loop bandwidth is increased beyond a certain level. Adding a derivative term to the loop controller helps improve the stability, but also makes the loop noisier. I also determined that tuning the integral and proportional terms independently may be appropriate in this situation because, with the acceleration limits, the plant (mount) is not a linear system. Experimental tuning using the procedure on this website led to some improvement, but there were still some stability issues when the integral term was made as large as desired. This experimentation suggested that, without some other change, the derivative term alone would not be enough to achieve the desired steady-state error under target acceleration while maintaining stability.
Most recently, I experimented with the acceleration limit (and the slew rate limit, though I don't think this is a major concern) and found that the existing limits may be more conservative than necessary. I may have set the current limits at a time when we were still dealing with #94 and thought that using a lower acceleration limit would help prevent this. I no longer believe this is necessary, so the next steps are to find a higher but still safe mount acceleration limit, tune the PID gains again, and test the system again.
Higher acceleration limits turned out to be feasible after some significant work in the point package. Slew rate commands are now sent to the mount asynchronously in separate processes, so acceleration is smooth and far less likely to cause motor stalls at higher accelerations. This in turn allowed for increasing the loop bandwidth from 0.5 Hz to roughly 3.5 Hz, which is sufficient to meet the goal of achieving a steady-state error of 0.01 degrees (36 arcseconds) for a target accelerating at 0.2 deg/s/s in either mount axis. A derivative term was added to the loop controller but this was found to be unnecessary after addressing the acceleration limit problem.
Further testing has shown that the control loop frequency must be reasonably fast--at least 15 Hz, maybe closer to 20 Hz--to achieve the desired level of stability. When the control cycle is too slow, the mount will oscillate several times upon reaching the target position. I suspect that the ratio between the loop update frequency and the loop bandwidth, now 3.5 Hz, is what ultimately matters. With a 15 Hz control loop frequency that ratio is only 4.3.
I created #177 to work on optimizing the code to hopefully speed up the control loop.
When testing optical tracking today, using PID gains set to integral 40 and damping factor sqrt(2)/2, the system exhibited a small-magnitude oscillation around the target when otherwise converged. This is despite the fact that the control cycle rate was 20-30 Hz depending on what camera exposure time setting I used. This was without the large OTA installed; I only had the guide scope attached and no counterweights, but I would think this should have improved the stability of the system.
After breaking down the setup I wondered if perhaps the fact that the new optical error source allows itself to use a cached encoder position from the mount could be a contributing factor. I'm not certain that using a slightly stale value from the mount is okay, but I think it is since the mount position is only needed to determine how to scale the right ascension axis error based on the declination axis position and this does not need to be exact.
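A quick way to sanity-check that reasoning is to see how much the scale factor changes for a small position error. The sketch below assumes the usual small-angle relationship for an equatorial mount, where an east-west on-sky error maps to a right ascension axis rotation of error / cos(declination). The function name and the 0.5 degree staleness are hypothetical, and the actual scaling in the optical error source may be implemented differently.

```python
import math

def ra_axis_error(on_sky_error_deg, dec_deg):
    """Convert an east-west on-sky error to an RA axis rotation.

    Small-angle approximation; diverges as dec_deg approaches +/-90.
    """
    return on_sky_error_deg / math.cos(math.radians(dec_deg))

# Compare the scaled error using the true declination against the value
# obtained from a cached position that is off by half a degree.
exact = ra_axis_error(0.1, 60.0)
stale = ra_axis_error(0.1, 60.5)
relative_change = abs(stale - exact) / exact
```

At 60 degrees declination a half-degree stale position changes the scaled error by under 2 percent, which acts as a small gain error rather than an offset. The sensitivity grows quickly near the celestial pole, so the conclusion may not hold there.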
I discovered a technique known as Model Predictive Control that looks extremely promising. Opened #184 to investigate whether this could replace the current PID controller and save us from these stability issues.
Model predictive control has been adopted, which has no parameters to tune. Therefore there is nothing left to do here.
The default loop parameters are pretty conservative and result in a very slow response rate. Experiment with different parameters to see if a stable loop with a wider bandwidth is possible.