indilib / indi-3rdparty

INDI 3rd Party drivers repository
https://www.indilib.org/devices.html
GNU Lesser General Public License v2.1

(CAUX) Implement predictive tracking with PID corrections in AltAz mode. #960

Closed jochym closed 3 weeks ago

jochym commented 1 month ago

This is a first attempt to implement tracking in AltAz mode with calculated track rates and PID-controlled fine-tuning. Most of the tracking is handled by the Horiz<->Eq transformation. The PID is used only to correct the small drift, probably caused by the skew in clock rates (to be verified). The test results show RMS tracking noise below 0.1 arcsec for Alt < 80 deg. This is essentially the LSB of the hardware.

knro commented 1 month ago

Thank you Pawel, that's amazing! So can this be shared among Alt-Az mounts? Is there anything specific to Celestron in the implementation changes?

Also please remove your gphoto changes.

jochym commented 1 month ago

@knro gphoto removed.

I think this is completely universal. There is nothing specific to the Celestron driver, except that the driver/variables/structure are probably different. But the algorithm is completely universal and should be easily portable to other alt/az drivers which can directly control tracking rates and report "encoder" positions (they need not be real encoders - they could be fake, as in the case of NS Evo). The one exception may be the coefficients defining rates in steps and arcsec/s, but this conversion is actually done outside the tracking loop. The algorithm is essentially:

  1. Calculate positions (Alt/Az) at two moments: now and now+dt (10s in this case - maybe we should define it as a constant).
  2. Calculate delta Az, delta Alt.
  3. Take care of any boundary crossings (Az in particular).
  4. Calculate rates: delta angle/dt.
  5. Calculate current offsets (errors) as pos(now) - target(now).
  6. Feed the errors to the PID to generate a corrective term on top of the predicted rates.
  7. Add both rates and send the result to the scope.
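The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the driver code: `target_altaz`, `tracking_rates`, `wrap_deg` and the `PID` class are hypothetical names, angles are in degrees, and the error is taken as target minus position (the opposite sign to step 5) so that a positive gain pushes the mount toward the target.

```python
def wrap_deg(delta):
    """Wrap an angle difference into [-180, 180) degrees (step 3)."""
    return (delta + 180.0) % 360.0 - 180.0


class PID:
    """Textbook PID controller; gains are placeholders, not driver defaults."""

    def __init__(self, kp, ki=0.0, kd=0.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def update(self, error, dt):
        self.integral += error * dt
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def tracking_rates(target_altaz, now, dt, measured, pid_az, pid_alt, loop_dt):
    """Return (az_rate, alt_rate) in deg/s.

    target_altaz(t) -> (az, alt) is a hypothetical stand-in for the
    RA/Dec -> Alt/Az transformation; measured is the (az, alt) read
    back from the mount.
    """
    az0, alt0 = target_altaz(now)        # step 1: position now
    az1, alt1 = target_altaz(now + dt)   # step 1: position at now + dt
    d_az = wrap_deg(az1 - az0)           # steps 2-3: delta with Az wrap-around
    d_alt = alt1 - alt0
    rate_az = d_az / dt                  # step 4: predicted rates
    rate_alt = d_alt / dt
    err_az = wrap_deg(az0 - measured[0])  # step 5 (sign: target - position)
    err_alt = alt0 - measured[1]
    # steps 6-7: PID correction added on top of the predicted rates
    return (rate_az + pid_az.update(err_az, loop_dt),
            rate_alt + pid_alt.update(err_alt, loop_dt))
```

With all PID gains set to zero, the function reduces to the purely predicted ("blind") rates.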

You can clearly see this is not complicated at all and quite universal. My hypothesis is that the small Alt/Az drift (a few arcsec/minute) from the predicted rates originates from the skew in clocks between the PC, the mount and real time, and from inaccuracies in Lat/Lon and other parameters. We can probably remove some of it, but I doubt we will get rid of all of it - so it is better to just handle all of it with a small correction from the PID.

knro commented 1 month ago

Thank you for the detailed explanation! Maybe this is a good chance to extract the algorithm to a common file (or in INDI::Telescope) so that it can be utilized by any mount? In particular, I'd like to test AZ-GTi performance

jochym commented 1 month ago

Maybe factoring out is a good idea for the future evolution of this code. How many "direct motor control" drivers do we have? In general, they should be essentially identical, except for the communication and hardware details. The tracking and general operation should be the same. Extracting a general "direct motor control" driver into the base INDI driver set will probably massively simplify the development of such drivers - limiting it to the implementation of the actual comm protocols and handling of the hardware. E.g. the EQmod driver seems to be much more low-level, but in general this low-level stuff can be relegated to this "hardware handling" block.

As for the performance of current version: I will be extremely interested in multiple tests on different hardware.

BTW I think the same drift issues will be present in EQ mode (if the time skew is really the reason) - maybe we should use the same PID correction scheme to handle this. But this is for the future. Now let us deliver the driver to the testers and solve inevitable issues ;)

jochym commented 4 weeks ago

I am running a long-duration test now and will report on the results later. The driver tracks nicely for 24 hours without any glitch (over serial). I am just nitpicking ;).

One thing is quite clear and easy to understand: there is a clear and consistent drift in both axes due to the clock skew between the PC (NTP-stabilized) and the mount (quartz-derived). In my case, it is approx. 10.8 s/24h (6.8 arcsec/h). It is probably different in each mount and will probably change with temperature. We can correct for this with a parameter or let the PID handle it (as we do now). But this has an impact on Eq mode as well: the mount will drift 7 arcsec/h in RA in Eq mode.

This drift is a minor component. There is also a 250 arcsec/rev component in Az and a 150 arcsec/rev (+71 arcsec backlash) component in Alt, periodic in both axes. It seems I can model both with a simple function. The problem is that I do not understand the function: both seem to be trigonometric functions (Az: the derivative of arctan, 1/(x^2 + 1); Alt: arctan) acting on time (HA or Az, probably). But these functions act on real numbers in (-oo, oo), not angles! And they are not periodic! This is strange, and I do not understand it. It may be connected with the alignment subsystem (I am running with the nearest plugin). I need to run the same test without alignment, and probably also verify it with the simulator (testing the clock skew hypothesis). Furthermore, my model may fit just by accident. The functions may be quite different (in fact, I think they are; they need to be periodic in HA or Az).

All this is just an attempt to understand the whole system better - and maybe improve it in the future. It really has no impact on the current driver. It is very stable and all these residuals are removed by the PID. So there is really no barrier to merging the current version of the patch (from my point of view, at least).
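As a quick sanity check of the clock-skew figures quoted above (10.8 s/24h and roughly 6.8 arcsec/h), the conversion can be written out as below. The function name is hypothetical, and the sidereal rate of about 15.041 arcsec/s is a standard approximate value that is not stated in the thread; the result is the drift for an object on the celestial equator.

```python
SIDEREAL_RATE_ARCSEC_PER_S = 15.041  # approximate apparent sky rotation rate (assumption)


def drift_arcsec_per_hour(skew_s_per_day):
    """Sky drift caused by a clock running skew_s_per_day seconds off per day."""
    skew_fraction = skew_s_per_day / 86400.0           # relative clock error
    return skew_fraction * SIDEREAL_RATE_ARCSEC_PER_S * 3600.0


print(round(drift_arcsec_per_hour(10.8), 2))
```

The relative clock error here is 10.8 / 86400 = 1.25e-4, which is why the drift is small per hour but adds up over a night.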

jochym commented 4 weeks ago

@knro The offsets and model of M31 tracked over 24h are here:

[Image: m31_track]

Do not pay too much attention to the model. This is just an attempt to understand the system. Only Alt and Az lines are really important (top graph, blue and orange). The good news is it is very stable over long periods (over 30h now) and very smooth (except for backlash, which we can detect by watching for the sign change in predicted rates).

This was run with alignment active. I am now running without the alignment system activated. I think this is not as simple as I thought at the start. Thus, I think we will be better off using the PID to correct these variable offsets and concentrating on solving the issue of backlash in Alt and Az (objects between the zenith and the celestial pole change direction in Az). The clock skew issue can be solved by calibrating the actual movement in both axes (run both axes at a constant rate for a few minutes and compare the results with the predicted values = dt*rate - this will give a mount constant). It could be incorporated into the driver - but to tell the truth I see no point, except for Eq mode; we need to use the PID anyway, so why bother. I will report the results of the second test, probably tomorrow.
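The calibration idea mentioned above (run an axis at a constant rate, then compare the measured travel with dt*rate) amounts to a single ratio. A minimal sketch, with a hypothetical function name and arbitrary but consistent units (for example encoder steps and steps/s):

```python
def mount_rate_constant(commanded_rate, elapsed_s, start_pos, end_pos):
    """Ratio of measured travel to predicted travel (dt * rate).

    A value of 1.0 means the mount clock and the PC clock agree;
    the deviation from 1.0 is the relative clock skew.
    """
    predicted = commanded_rate * elapsed_s
    return (end_pos - start_pos) / predicted


# Example with made-up numbers: 300 s at 100 steps/s, measured travel 30003.75 steps.
print(mount_rate_constant(100.0, 300.0, 0.0, 30003.75))
```

The made-up example corresponds to a relative skew of 1.25e-4, the same order as a clock that is off by about 10.8 s per day.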

jochym commented 4 weeks ago

Another trace. This time it is delta Cas, which transits at 80 deg on the northern side of the zenith. Thus, the Az axis changes tracking direction, and of course we have the same backlash effect. This track was executed with the alignment system deactivated, which shows that the deviations are not connected to alignment. The model shows that the deviation is probably some simple function of Alt/Az/HA. I would really like to know what it is, but I do not think it is worth the effort to find out. It has a 300 arcsec max amplitude, which is easily canceled with the PID (or guiding), which we need to use anyway.

This concludes my safety tests - all corner cases I was able to come up with.

[Image: navi_tracking]

jochym commented 3 weeks ago

It came to me that what we do is, essentially, numerical integration of a function (the mount is the integrator). The formula we use is the first-order, forward-difference approximation of the derivative. This of course leads to an accumulation of errors. Since we change speed in a way that is symmetrical around the meridian, the errors will change sign and may cancel after a full rotation. But these are tiny errors, and the maximum accumulated error of the sum is ~1/1000 after a full rotation. The forward-difference error is $O(\Delta t)$; this is the simplest formula:

$$ \frac{d f(x)}{dx} = \lim_{h \rightarrow 0} \frac{f(x+h) - f(x)}{h}, \qquad \frac{d f(x)}{dx} = \frac{f(x+h) - f(x)}{h} + O(h). $$

We can, of course, use a higher-order formula. The first we should try is a central difference:

$$ \frac{d f(x)}{dx} = \frac{f(x+h) - f(x-h)}{2h} + O(h^2), $$

which has errors quadratic in time step. This should do the trick if my hypothesis is correct. The price is really moderate: either two calls to RADec->AltAz function, or a little complication of the timer function to use previously calculated positions and not waste cycles. We can also reduce the error by changing the time step to a smaller value - risking hitting the numerical accuracy barrier (this is real - numerical derivatives are tricky that way, see: https://en.wikipedia.org/wiki/Numerical_differentiation). Or we can do nothing and let the PID handle this. ;)
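The difference between the two formulas is easy to see numerically. The sketch below uses `math.sin` as a stand-in for the position function (an assumption for illustration only; in the driver the function would be the RADec->AltAz conversion) and compares the errors of both approximations as the step is halved.

```python
import math


def forward_diff(f, x, h):
    """First-order forward difference: error shrinks linearly with h."""
    return (f(x + h) - f(x)) / h


def central_diff(f, x, h):
    """Second-order central difference: error shrinks quadratically with h."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


x = 1.0
exact = math.cos(x)  # known derivative of sin at x
for h in (0.1, 0.05):
    fwd_err = abs(forward_diff(math.sin, x, h) - exact)
    cen_err = abs(central_diff(math.sin, x, h) - exact)
    print(f"h={h}: forward error {fwd_err:.2e}, central error {cen_err:.2e}")
```

Halving the step roughly halves the forward-difference error but cuts the central-difference error by about a factor of four, at the cost of one extra function evaluation (or reuse of a previously computed position).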

I am tempted to try this. But in practice, the PID approach seems better/easier/simpler. I doubt we will get rid of all errors (e.g. the clock skew will still be there, and the PID is really the best way to correct it - all other methods are more complicated and troublesome for the user). Now that I understand the cause, I will probably resist the temptation and concentrate on improving the handling of the backlash. ;)

What do you think @knro ?

jochym commented 3 weeks ago

Since it was so easy, I just changed the formula to central difference. For the 5s time step, we have blind tracking drift on my mount around 1.5 arcsec/h. This looks good. I will post the long-time plot later.

knro commented 3 weeks ago

Wow, that's impressive! I had to read your comment a few times to wrap my head around it. So in essence the 2nd-order formula would be sufficient to adjust the tracking rates and minimize errors without relying on the PID. Did I understand this correctly?

jochym commented 3 weeks ago

Yes, you got it right. With the second-order formula (central difference), and without the PID, the total drift with my mount is of the order of 2 arcsec/hour. This is enough even for short exposures (1-3 minutes, depending on the sky area, I guess) and completely satisfactory for any visual work. I would still keep the PID for cases where we want to remove all drift, but as it looks now I would switch it off in the default config (zero coeffs). Here is the plot of the tracking over the last two hours ($\theta$ Dra, from Cracow, Poland, lon:20 lat:50, approx. from NE to SE).

[Image: central]

One advantage of this (blind) mode of tracking is that we are (almost) immune to any communication problems (lost packets, timing problems etc.), since there is a much weaker feedback loop between the mount and the driver. So backlash oscillations, Wi-Fi communication spikes etc. do not disturb the tracking. We can always implement a "Zero the error" button/function, which will use the PID to reset the tracking to the central line by engaging the corrections for a short period (10-30s should be enough). We can even do it periodically, when the accumulated error exceeds a certain threshold. All this will be fairly easy to do on top of the new code. The cost is really slim: two additional RADec->AltAz conversions and a few additional variables. I think it is worth it.

Shall I add the modification to the PR?

knro commented 3 weeks ago

That's awesome, please go ahead!

The question now is how we can move this functionality outside the class so that other mount drivers can perhaps utilize it? Right now, I can think about SkyWatcherAltAz driver, but perhaps we'll have more later on.

jochym commented 3 weeks ago

It is possible, and even easy, to factor it out in the form of CalculateTrackingRates function. It will need the same info as TransformCelestialToTelescope and AltitudeAzimuthFromTelescopeDirectionVector plus the time and time step. The rest is actually the same tracking loop as before. Is the tracking code in SkyWatcher similar in any way to the one in AUX driver?

The code is in.

knro commented 3 weeks ago

Yes they're very similar. At any rate, let's keep this as is, and I'll test it on SkyWatcher and let you know how it goes.

This is quite outstanding work @jochym as nobody was able to solve the tracking issues for several years!

jochym commented 3 weeks ago

As for specificity to AUX mounts: the only thing I can think of is the units of the tracking rate. As is, the positions/offsets are in encoder units (2^24 per revolution, or STEPS_PER_DEGREE), while tracking rates are in motor controller (AUX protocol) units of 1024 arcsec/s (79.1015625... in steps). The last number is kind of crucial, because without knowing it exactly we would be unable to translate angular rates to commands for the scope. Here I suspect there may be some problems with porting this to other drivers - but since it is well-defined and localized, it should be manageable.
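The 79.1015625 figure can be reproduced exactly from the two numbers quoted above (2^24 encoder steps per revolution and the factor 1024). The sketch below only checks that arithmetic; the constant names are illustrative, and how the factor is applied inside the driver is not shown here.

```python
from fractions import Fraction

STEPS_PER_REV = 2 ** 24        # encoder steps per full revolution (from the comment above)
ARCSEC_PER_REV = 360 * 3600    # 1,296,000 arcsec in a full circle

arcsec_per_step = Fraction(ARCSEC_PER_REV, STEPS_PER_REV)
factor = 1024 * arcsec_per_step  # exact rational value

print(float(arcsec_per_step))  # angular size of one encoder step, in arcsec
print(float(factor))           # the constant relating step units to 1024-scaled arcsec units
```

Because the denominator is a power of two, the factor is exactly representable in binary floating point, so no precision is lost by storing it as a `double`.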

Please update the debian packaging file to properly reflect the changed version of the driver. I bumped the version to 1.4 in the driver itself.

jochym commented 3 weeks ago

Oh, one more thing I forgot. Would you change the defaults in the PID? The current values are completely inappropriate. Depending on what you decide: either switch it off completely (so all zeros) or set it to P=30 I=0 D=0 for both axes (this results in a 5s settling time for the mount; P=20 is roughly 10s and more stable). I would vote for switching it off - the simpler, the better. This also eliminates the backlash oscillations problem - so double vote. Without the PID it handles backlash like this:

[Image: Screenshot from 2024-08-21 16-09-09]

A clean jump in offset, done by the mount. If your backlash settings in the mount are correct, the cancelling will be done by the mount (maybe we should add settings for this to the panel?). We can even detect the jump and signal to the camera driver that the exposure should be interrupted because the mount has moved. Since this is mechanics, we cannot expect the mount to cancel the backlash to a fraction of an arcsec. Thus, it is better to cut the exposure than to ruin it.

jochym commented 3 weeks ago

Worst-case scenario. I found a star passing 8' from the zenith and tracked it. Obviously, the errors were large but controllable. With a 1.25 deg/s azimuth tracking rate, the maximum error in azimuth was 0.8 deg - which I would consider very good (it is an offset in Az at 88 deg 52' while tracking at 1 deg/s - the difference in position is negligible and corresponds to 0.5s of time difference). This is a ridiculous job for an AltAz mount. As soon as the Alt got below 88 deg, the offset fell below 50 arcsec.

[Image: zenith-8arcmin]

This was run out of curiosity and to test the limits. One would think that the PID would help in this case, but with the current limits on the corrective terms it actually harmed the recovery - it was too slow and weak to correct the prediction, but generated a long-lasting overshoot. We may consider increasing these limits if we decide to make the driver capable of tracking fast-moving objects like satellites. For now, I would leave it as is.