Implement Least Square MA

femtotrader commented 4 months ago

This chapter on Least Squares Regression could help: https://pythonnumericalmethods.studentorg.berkeley.edu/notebooks/chapter16.04-Least-Squares-Regression-in-Python.html What kind of dependencies are allowed? NumPy would help for this kind of work.

QuantConnect Lean (C#) provides such an indicator: https://github.com/QuantConnect/Lean/blob/master/Indicators/LeastSquaresMovingAverage.cs

How should such an indicator work when the input data are not equally spaced in time?

femtotrader commented 4 months ago

I have done some tests with QC (Research environment)...

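Here is the code:

from decimal import Decimal

import pandas as pd
# LeastSquaresMovingAverage is provided by the QuantConnect research environment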

CLOSE_TMPL = [10.5, 9.78, 10.46, 10.51, 10.55, 10.72, 10.16, 10.25, 9.4, 9.5, 9.23, 8.5, 8.8, 8.33, 7.53, 7.61, 6.78, 8.6, 9.21, 8.95, 9.22, 9.1, 8.31, 8.37, 8.3, 7.78, 8.05, 8.1, 8.08, 7.49, 7.58, 8.17, 8.83, 8.91, 9.2, 9.76, 9.42, 9.3, 9.32, 9.04, 9.0, 9.33, 9.34, 8.49, 9.21, 10.15, 10.3, 10.59, 10.23, 10.0]
TIME_TMPL = pd.date_range("2020-01-01", freq="D", periods=len(CLOSE_TMPL))
df = pd.DataFrame({"Close": CLOSE_TMPL}, index=TIME_TMPL)
period = 5
indicator = LeastSquaresMovingAverage(period)
indicator.is_ready, indicator.current.time, indicator.current.value
indicator_is_ready = []
indicator_output_values = []
prec = 2
for current_time, row in df.iterrows():
    # round each close to `prec` decimals before feeding it to the indicator
    current_value = Decimal(row["Close"]).quantize(Decimal(10) ** -prec)
    indicator.update(current_time, current_value)
    indicator_is_ready.append(indicator.is_ready)
    indicator_output_values.append(indicator.current.value)
    #print(current_time, current_value, indicator.is_ready, indicator.current.value)  # value = Intercept.Current.Value + Slope.Current.Value * Period
df["is_ready"] = indicator_is_ready
df["indicator_output_values"] = indicator_output_values
df

and the output:

            Close  is_ready  indicator_output_values
2020-01-01  10.50     False                   10.500
2020-01-02   9.78     False                    9.780
2020-01-03  10.46     False                   10.460
2020-01-04  10.51     False                   10.510
2020-01-05  10.55      True                   10.526
2020-01-06  10.72      True                   10.798
2020-01-07  10.16      True                   10.402
2020-01-08  10.25      True                   10.256
2020-01-09   9.40      True                    9.662
2020-01-10   9.50      True                    9.366
2020-01-11   9.23      True                    9.186
2020-01-12   8.50      True                    8.642
2020-01-13   8.80      True                    8.646
2020-01-14   8.33      True                    8.318
2020-01-15   7.53      True                    7.764
2020-01-16   7.61      True                    7.544
2020-01-17   6.78      True                    6.858
2020-01-18   8.60      True                    7.728
2020-01-19   9.21      True                    8.816
2020-01-20   8.95      True                    9.252
2020-01-21   9.22      True                    9.598
2020-01-22   9.10      True                    9.218
2020-01-23   8.31      True                    8.628
2020-01-24   8.37      True                    8.376
...
2020-02-15  10.15      True                    9.606
2020-02-16  10.30      True                   10.214
2020-02-17  10.59      True                   10.806
2020-02-18  10.23      True                   10.592
2020-02-19  10.00      True                   10.180

I can't output the slope and intercept separately, only Intercept.Current.Value + Slope.Current.Value * Period, but I think this reference data can help with tests.

This kind of unit test can also help (in LeastSquaresMovingAverageTest.cs):
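
To recover the slope and intercept behind a reference value, the regression can be redone by hand. This is only a sketch: `slope_intercept` is a made-up helper, and it assumes the bars in a window are numbered x = 1..period, which is what the formula Intercept + Slope * Period suggests.

```python
def slope_intercept(window):
    """Closed-form least-squares line through (1, y1), ..., (n, yn)."""
    n = len(window)
    xs = range(1, n + 1)                      # assumption: equally spaced bars, x = 1..n
    mean_x = sum(xs) / n
    mean_y = sum(window) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, window))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


period = 5
window = [10.5, 9.78, 10.46, 10.51, 10.55]    # first five closes of CLOSE_TMPL
slope, intercept = slope_intercept(window)
value = intercept + slope * period            # line evaluated at the last bar
print(round(slope, 3), round(intercept, 3), round(value, 3))  # 0.083 10.111 10.526
```

The last number matches the first "ready" row of the table (2020-01-05), and no third-party dependency is needed for it.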

        [Test]
        public void TalippCompare_with_period_2()
        {
            decimal[] CLOSE_TMPL = new decimal[]
            {
                10.5m, 9.78m, 10.46m, 10.51m, 10.55m, 10.72m, 10.16m, 10.25m, 9.4m, 9.5m,
                9.23m, 8.5m, 8.8m, 8.33m, 7.53m, 7.61m, 6.78m, 8.6m, 9.21m, 8.95m,
                9.22m, 9.1m, 8.31m, 8.37m, 8.3m, 7.78m, 8.05m, 8.1m, 8.08m, 7.49m,
                7.58m, 8.17m, 8.83m, 8.91m, 9.2m, 9.76m, 9.42m, 9.3m, 9.32m, 9.04m,
                9.0m, 9.33m, 9.34m, 8.49m, 9.21m, 10.15m, 10.3m, 10.59m, 10.23m, 10.0m
            };
            DateTime[] DATE_TMPL = Enumerable.Range(0, CLOSE_TMPL.Length)
                                         .Select(i => new DateTime(2024, 7, 7).AddDays(i))
                                         .ToArray();
            decimal[] expected_slope = new decimal[]
            {
                0m, -0.720m, 0.680m, 0.050m, 0.040m, 0.170m, -0.560m, 0.090m, -0.85m, 0.100m,
                -0.270m, -0.730m, 0.300m, -0.470m, -0.800m, 0.080m, -0.830m, 1.820m, 0.610m, -0.260m,
                0.270m, -0.120m, -0.790m, 0.060m, -0.070m, -0.520m, 0.270m, 0.050m, -0.020m, -0.590m,
                0.090m, 0.590m, 0.660m, 0.080m, 0.290m, 0.560m, -0.340m, -0.120m, 0.020m, -0.280m,
                -0.040m, 0.330m, 0.0100m, -0.850m, 0.720m, 0.940m, 0.150m, 0.290m, -0.360m, -0.230m
            };
            decimal[] expected_intercept = new decimal[]
            {
                0.000m, 11.220m, 9.100m, 10.410m, 10.470m, 10.380m, 11.280m, 10.070m, 11.100m, 9.300m,
                9.770m, 9.960m, 8.200m, 9.270m, 9.130m, 7.450m, 8.440m, 4.960m, 7.990m, 9.470m,
                8.680m, 9.340m, 9.890m, 8.250m, 8.440m, 8.820m, 7.510m, 8.000m, 8.120m, 8.670m,
                7.400m, 6.990m, 7.510m, 8.750m, 8.620m, 8.640m, 10.100m, 9.540m, 9.280m, 9.600m,
                9.080m, 8.670m, 9.320m, 10.190m, 7.770m, 8.270m, 10.000m, 10.010m, 10.950m, 10.460m
            };
            var indicator = new LeastSquaresMovingAverage(2);
            for (int i=0; i<CLOSE_TMPL.Length; i++)
            {
                indicator.Update(DATE_TMPL[i], CLOSE_TMPL[i]);
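                // NUnit convention: expected value first, actual value second
                Assert.AreEqual(expected_slope[i], Math.Round((decimal)indicator.Slope.Current.Value, 4));
                Assert.AreEqual(expected_intercept[i], indicator.Intercept.Current.Value);
                // With period 2 the fitted line passes through both points, so the value equals the close
                Assert.AreEqual(CLOSE_TMPL[i], Math.Round((decimal)indicator.Current.Value, 4));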
            }
        }

        [Test]
        public void TalippCompare_with_period_5()
        {
            decimal[] CLOSE_TMPL = new decimal[]
            {
                10.5m, 9.78m, 10.46m, 10.51m, 10.55m, 10.72m, 10.16m, 10.25m, 9.4m, 9.5m,
                9.23m, 8.5m, 8.8m, 8.33m, 7.53m, 7.61m, 6.78m, 8.6m, 9.21m, 8.95m,
                9.22m, 9.1m, 8.31m, 8.37m, 8.3m, 7.78m, 8.05m, 8.1m, 8.08m, 7.49m,
                7.58m, 8.17m, 8.83m, 8.91m, 9.2m, 9.76m, 9.42m, 9.3m, 9.32m, 9.04m,
                9.0m, 9.33m, 9.34m, 8.49m, 9.21m, 10.15m, 10.3m, 10.59m, 10.23m, 10.0m
            };
            DateTime[] DATE_TMPL = Enumerable.Range(0, CLOSE_TMPL.Length)
                                         .Select(i => new DateTime(2024, 7, 7).AddDays(i))
                                         .ToArray();
            //decimal[] expected_zeros = Enumerable.Repeat(0m, 50).ToArray();
            decimal[] expected_slope = new decimal[]
            {
                0.000m, 0.000m, 0.000m, 0.000m, 0.083m, 0.197m, -0.039m, -0.091m, -0.277m, -0.320m,
                -0.261m, -0.367m, -0.220m, -0.277m, -0.357m, -0.305m, -0.476m, -0.021m, 0.435m, 0.511m,
                0.523m, 0.101m, -0.165m, -0.207m, -0.257m, -0.265m, -0.111m, -0.079m, -0.012m, -0.055m,
                -0.155m, -0.036m, 0.218m, 0.409m, 0.398m, 0.355m, 0.203m, 0.100m, -0.022m, -0.154m,
                -0.110m, -0.026m, 0.033m, -0.076m, -0.042m, 0.151m, 0.358m, 0.529m, 0.248m, -0.037m
            };
            decimal[] expected_intercept = new decimal[]
            {
                0.000m, 0.000m, 0.000m, 0.000m, 10.111m, 9.813m, 10.597m, 10.711m, 11.047m, 10.966m,
                10.491m, 10.477m, 9.746m, 9.703m, 9.549m, 9.069m, 9.238m, 7.833m, 6.641m, 6.697m,
                6.983m, 8.713m, 9.453m, 9.411m, 9.431m, 9.167m, 8.495m, 8.357m, 8.098m, 8.065m,
                8.325m, 7.992m, 7.376m, 6.969m, 7.344m, 7.909m, 8.615m, 9.018m, 9.466m, 9.830m,
                9.546m, 9.276m, 9.107m, 9.268m, 9.200m, 8.851m, 8.424m, 8.161m, 9.352m, 10.365m
            };
            decimal[] expected_pred = new decimal[]
            {
                10.5m, 9.78m, 10.46m, 10.51m, 10.526m, 10.798m, 10.402m, 10.256m, 9.662m, 9.366m,
                9.186m, 8.642m, 8.646m, 8.318m, 7.764m, 7.544m, 6.858m, 7.728m, 8.816m, 9.252m,
                9.598m, 9.218m, 8.628m, 8.376m, 8.146m, 7.842m, 7.940m, 7.962m, 8.038m, 7.790m,
                7.550m, 7.812m, 8.466m, 9.014m, 9.334m, 9.684m, 9.630m, 9.518m, 9.356m, 9.060m,
                8.996m, 9.146m, 9.272m, 8.888m, 8.990m, 9.606m, 10.214m, 10.806m, 10.592m, 10.180m
            };
            var indicator = new LeastSquaresMovingAverage(5);
            for (int i = 0; i < CLOSE_TMPL.Length; i++)
            {
                indicator.Update(DATE_TMPL[i], CLOSE_TMPL[i]);
                // NUnit convention: expected value first, actual value second
                Assert.AreEqual(expected_slope[i], Math.Round((decimal)indicator.Slope.Current.Value, 4));
                Assert.AreEqual(expected_intercept[i], indicator.Intercept.Current.Value);
                Assert.AreEqual(expected_pred[i], Math.Round((decimal)indicator.Current.Value, 4));
            }
        }

femtotrader commented 4 months ago

https://github.com/nardew/talipp/pull/149 is a draft PR to showcase a possible solution using NumPy.
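
To give an idea of what NumPy buys here, below is a vectorized batch sketch. It is not the code from the PR: `lsma_batch` is a made-up name, and it assumes equally spaced bars numbered 1..period and NumPy 1.20+ for `sliding_window_view`.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def lsma_batch(closes, period):
    """LSMA for every full window of `closes`, computed in one shot."""
    y = np.asarray(closes, dtype=float)
    windows = sliding_window_view(y, period)      # shape: (len(y) - period + 1, period)
    x = np.arange(1, period + 1, dtype=float)     # assumption: equally spaced bars
    xc = x - x.mean()                             # centered x, sums to zero
    slope = windows @ xc / (xc @ xc)              # per-window least-squares slope
    intercept = windows.mean(axis=1) - slope * x.mean()
    return intercept + slope * period             # line evaluated at the last bar


closes = [10.5, 9.78, 10.46, 10.51, 10.55, 10.72, 10.16, 10.25]
print(np.round(lsma_batch(closes, 5), 3))
```

The four values this produces correspond to the first four "ready" rows of the QuantConnect output (10.526, 10.798, 10.402, 10.256).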

Reference data for unit tests come from QuantConnect Lean but I'm not 100% sure that it's good enough.

I think this implementation works correctly only with regularly spaced incoming data (which is the case here); it probably won't behave correctly when data are irregularly spaced in time.

femtotrader commented 4 months ago

I wonder if LSMA is different from the Time Series Forecast (TSF) implemented in Tulip (https://tulipindicators.org/tsf, https://github.com/TulipCharts/tulipindicators/blob/master/indicators/tsf.c) or in C# (https://github.com/QuantConnect/Lean/pull/7654).

nardew commented 3 months ago

Thanks a lot for this! Unfortunately I cannot spend much time on the library, but I am looking forward to looking into this as soon as possible.

femtotrader commented 3 months ago

I wonder how TSF https://tulipindicators.org/tsf is different from LSMA

nardew / talipp

Implement Least Square MA #23