brandtg / stl-java

A Java implementation of STL
Apache License 2.0
21 stars 17 forks source link

Apply weighted mean smoothing if periodic, after outer loop #2

Closed brandtg closed 9 years ago

brandtg commented 9 years ago

In the R code, if periodic is true, then weighted mean smoothing is applied to the cycle subseries of the seasonal component:

    if (periodic) {
        which.cycle <- cycle(x)
        z$seasonal <- tapply(z$seasonal, which.cycle, mean)[which.cycle]
    }
    remainder <- as.vector(x) - z$seasonal - z$trend

This corresponds to "2.5 Post-smoothing of the seasonal" from the paper.

The CycleSubSeries inner class of StlDecomposition and StlDecomposition#weightedMeanSmooth functions may be of use here.

There's a comment in StlDecomposition#decompose that shows where this should go, around line 145.

hntd187 commented 9 years ago

So what this seems to be to me I prototyped around with your test data in R to try and understand it. So the cycle creates basically an index for the value in each time period like

Array(1,2,3,4,5,6,7,1,2,3,4,5,6,7);

So on and so forth, then the using that it uses that for the mean to smooth.

So for a test dataset

Array(1,2,3,4,5,6,7,7,6,5,4,3,2,1); The first one would take the mean of 1 and 7 being 4, so the seasonal would then be scaled based on that mean, and the remainder re calculated as a result.

I'm hacking around with this right now, I should have a rough POC for what I mean sometime tomorrow, it'll be rough, not going to lie, but we can refine it once I have it working.

brandtg commented 9 years ago

I'm not sure I completely follow the example, but from reading the paper, it seems like the idea is that although we get smoothing among cycle-subseries, when recombining the cycle-subseries into the full time series, we may have discontinuities between adjacent cycle-subseries components. For example, if seasonal is year, we might have roughness from day to day.

So in the example, it seems like at this step, instead of considering 1 and 7 (which we did when decomposing into cycle-subseries), we want to consider the period 1,2,3,4,5,6,7, and apply smoothing within this. E.g. getting quarterly means for presidents, at bottom of tapply doc (https://stat.ethz.ch/R-manual/R-devel/library/base/html/tapply.html), which is similar to:

if (periodic) {
    which.cycle <- cycle(x)
    z$seasonal <- tapply(z$seasonal, which.cycle, mean)[which.cycle]
}

The actual smoothing seems to be up to the implementation. The paper says to use Loess with "locally-quadratic fitting because there are so many peaks and valleys in the seasonal component", whereas it looks like the R implementation uses mean smoothing.

It's probably better IMO to go with the R approach, as it would be more familiar to users, and allow us to compare with real output data.

Does this sound right?

hntd187 commented 9 years ago

Yes, that is along the lines of what I was attempting to explain, sorry for being unclear. I myself read it back and asked myself what I was talking about. Either way, the R implementation to the best of my knowledge uses either mean smoothing for periodic or loess for non-periodic. The thing that confuses me about this portion is under what period do they smooth? For the mean smoothing it looks like they smooth along the number of observations in the period so for our test data we would smooth yearly. This is because the output of the cycle command is what I described above

final int[] cycle = new int[]{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12};

So I'm working on smoothing like that based on the period after the robustness and inner loops, I had something almost working before leaving work today, I should have a rough idea finished tomorrow, but the numbers still seemed a bit off so maybe I'm wrong. I think we're on the same page as to what we have to do, but the actual implementation in R is kinda vague still even about this part, so it seems a question of if we are doing it properly. I'd expect some rounding difference among the 2 implementations, but mine is getting rounding errors in the hundreds and thousands which I assume is not correct.

brandtg commented 9 years ago

That makes sense to me.

If periodic is provided in the R function, is mean smoothing or loess used in the loops? If mean smoothing is used there as well as after the loops, that may be a reason for the difference.

hntd187 commented 9 years ago

Now here is another thing I'm confused about in regards to this. https://stat.ethz.ch/R-manual/R-devel/library/stats/html/stl.html

The documentation talks about the method here in some ultra vague terms, but what's important here is that at the bottom under See Also, it says "plot.stl for stl methods; loess in package stats (which is not actually used in stl)." I think this is saying that this specific function is not used in stl, but loess is still used in the actual implementation, but reading it again it seems to appear to me that it's the following.

Periodic Mean smoothing of the sub series Mean smoothing of the overall seasonal afterwards

Non-Periodic Loess smoothing of the sub series No overall seasonal smoothing afterwards

Does this seem right? If we're on the same page, I think I should have some turn around on these methods by the end of the day.

hntd187 commented 9 years ago

Okay so I tried this out and I think something is wrong with the way we're doing it.

So your weightedmean smoothing

private double[] weightedMeanSmooth(double[] series, double[] weights) {
    double[] smoothed = new double[series.length];
    double mean = 0;
    double sumOfWeights = 0;
    for (int i = 0; i < series.length; i++) {
      double weight = (weights != null) ? weights[i] : 1; // equal weights if
                                                          // none specified
      mean += weight * series[i];
      sumOfWeights += weight;
    }
    // TODO: This is a hack to not have NaN values
    if (sumOfWeights == 0) {
      for (int i = 0; i < series.length; i++) {
        smoothed[i] = series[i];
      }
    } else {
      mean /= sumOfWeights;
      for (int i = 0; i < series.length; i++) {
        smoothed[i] = mean;
      }
    }
    return smoothed;
  }

and mine ...

  private double[] periodicSmoothing(double[] seasonal, int observations){
    double[] smoothedSeasonal = new double[seasonal.length];
    List<double[]> seasonalSubSeries = new ArrayList<double[]>();
    for(int i = 0; i < seasonal.length; i += observations){
      int end;
      if (i + observations > seasonal.length){
        end = seasonal.length;
      }else{
        end = i + observations;
      }
      seasonalSubSeries.add(Arrays.copyOfRange(seasonal, i, end));
    }
    for(int i = 0; i < seasonalSubSeries.size(); i++){
      seasonalSubSeries.set(i, mean(seasonalSubSeries.get(i)));
    }
    return smoothedSeasonal;
  }

They produce the same output...

[0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415]
[0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415, 0.004305824576122415]

But when they are applied to the seasonal, they do not produce correct results. they're kind of close, but they aren't the same as the R version.

Components
         seasonal    trend    remainder
Jan  1 -9.3861026 49.93672  0.049382265
Feb  1 -9.6661427 49.82122  0.644917683
Mar  1 -6.9393442 49.70573  1.633614509
Apr  1 -2.7986740 49.62807 -0.129398706
May  1  3.4073341 49.55042  1.142250128
Jun  1  9.0507920 49.50588 -0.056672906
Jul  1 12.8484910 49.46135 -4.609837046
Aug  1 11.6057897 49.41058 -4.616367855
Sep  1  7.4204835 49.35981 -2.480293842
Oct  1  0.4193803 49.35919  0.721427768
Nov  1 -6.4623380 49.35857  0.003764365
Dec  1 -9.4996690 49.56374 -0.264073479
Jan  2 -9.3861026 49.76891  3.817191150
Feb  2 -9.6661427 50.00456 -0.538413350
Mar  2 -6.9393442 50.24020  1.799143557
Apr  2 -2.7986740 50.31155 -0.512879879
May  2  3.4073341 50.38291  0.309758734
Jun  2  9.0507920 50.27720 -0.627991282
Jul  2 12.8484910 50.17149  3.280017596
Aug  2 11.6057897 49.90771 -1.613495917
Sep  2  7.4204835 49.64392 -0.064404607
Oct  2  0.4193803 49.34640  4.434222795
Nov  2 -6.4623380 49.04887 -2.886534816
Dec  2 -9.4996690 48.73136  3.568313970
Jan  3 -9.3861026 48.41384 -1.527734771
Feb  3 -9.6661427 48.06903  0.297117162
Mar  3 -6.9393442 47.72421 -1.284869497
Apr  3 -2.7986740 47.50567 -2.606991207
May  3  3.4073341 47.28712  5.005549132
Jun  3  9.0507920 47.32458  1.424629419
Jul  3 12.8484910 47.36204 -3.410531401
Aug  3 11.6057897 47.55089 -4.856678328
Sep  3  7.4204835 47.73974 -0.860220432
Oct  3  0.4193803 47.90519 -1.224573414
Nov  3 -6.4623380 48.07065  0.191688592
Dec  3 -9.4996690 48.19412  3.005544893
Jan  4 -9.3861026 48.31760  2.868503667
Feb  4 -9.6661427 48.41353  1.352614425
Mar  4 -6.9393442 48.50946  1.329886593
Apr  4 -2.7986740 48.44011  0.158564871
May  4  3.4073341 48.37076 -2.578094801
Jun  4  9.0507920 48.15219 -4.502980157
Jul  4 12.8484910 47.93362  3.417893380
Aug  4 11.6057897 47.72989  0.264319545
Sep  4  7.4204835 47.52617 -0.546649468
Oct  4  0.4193803 47.49211  1.288509329
Nov  4 -6.4623380 47.45805 -4.695716887
Dec  4 -9.4996690 47.52079 -0.421120614
Jan  5 -9.3861026 47.58352  1.102578132
Feb  5 -9.6661427 47.63996 -0.473813462
Mar  5 -6.9393442 47.69639 -2.457043649
Apr  5 -2.7986740 47.87309  0.425580327
May  5  3.4073341 48.04980  1.742866350
Jun  5  9.0507920 48.33410  0.315107399
Jul  5 12.8484910 48.61840 -0.666892658
Aug  5 11.6057897 48.82791 -2.233698878
Sep  5  7.4204835 49.03742 -0.057900275
Oct  5  0.4193803 49.15661  0.224011195
Nov  5 -6.4623380 49.27580  1.586537651
Dec  5 -9.4996690 49.38969  3.709982712
Jan  6 -9.3861026 49.50357 -0.117469754
Feb  6 -9.6661427 49.55832  0.607823729
Mar  6 -6.9393442 49.61307 -1.873721380
Apr  6 -2.7986740 49.45738 -1.558710217
May  6  3.4073341 49.30170  1.090962994
Jun  6  9.0507920 49.07413  1.275073880
Jul  6 12.8484910 48.84657  1.804943660
Aug  6 11.6057897 48.81995  0.574261482
Sep  6  7.4204835 48.79333 -3.213815873
Oct  6  0.4193803 48.82453  0.756087764
Nov  6 -6.4623380 48.85573 -4.293393612
Dec  6 -9.4996690 48.81857 -3.018901983
Jan  7 -9.3861026 48.78141 -0.195307880
Feb  7 -9.6661427 48.82965  4.236496382
Mar  7 -6.9393442 48.87788  1.461462051
Apr  7 -2.7986740 48.99277  2.705907837
May  7  3.4073341 49.10765 -1.914984328
Jun  7  9.0507920 49.12810 -1.378891531
Jul  7 12.8484910 49.14855  0.502960161
Aug  7 11.6057897 49.08610  1.308107515
Sep  7  7.4204835 49.02366  1.055859691
Oct  7  0.4193803 48.99167 -2.711052679
Nov  7 -6.4623380 48.95969 -0.897350062
Dec  7 -9.4996690 48.89459  0.405077825
Jan  8 -9.3861026 48.82949 -0.043391814
Feb  8 -9.6661427 48.72961 -0.563463816
Mar  8 -6.9393442 48.62972  3.609625591
Apr  8 -2.7986740 48.59456  1.304114436
May  8  3.4073341 48.55940 -0.266734669
Jun  8  9.0507920 48.53544 -2.586236345
Jul  8 12.8484910 48.51149 -0.959979127
Aug  8 11.6057897 48.52942  0.364787440
Sep  8  7.4204835 48.54736 -1.267841171
Oct  8  0.4193803 48.60487  1.275752051
Nov  8 -6.4623380 48.66238  0.099960261
Dec  8 -9.4996690 48.73550 -4.035835350
Jan  9 -9.3861026 48.80863  1.377471511
Feb  9 -9.6661427 48.85097  1.915170160
Mar  9 -6.9393442 48.89331  0.846030217
Apr  9 -2.7986740 48.90422  1.194458154
May  9  3.4073341 48.91512 -1.422451859
Jun  9  9.0507920 48.78885 -1.439641699
Jul  9 12.8484910 48.66258  0.688927356
Aug  9 11.6057897 48.41240  0.481806493
Sep  9  7.4204835 48.16223 -0.182709547
Oct  9  0.4193803 48.02414  1.756481662
Nov  9 -6.4623380 47.88605  1.576287859
Dec  9 -9.4996690 47.88943 -1.089759075
Jan 10 -9.3861026 47.89281 -3.706703535
Feb 10 -9.6661427 47.94094 -6.974798332
Mar 10 -6.9393442 47.98908 -0.049731721
Apr 10 -2.7986740 48.11529 -1.416620603
May 10  3.4073341 48.24151  1.451152564
Jun 10  9.0507920 48.51402 -0.664814834
Jul 10 12.8484910 48.78653  0.864976662
Aug 10 11.6057897 49.01896 -0.324749473
Sep 10  7.4204835 49.25139  3.128129214
Oct 10  0.4193803 49.32324 -0.542616791
Nov 10 -6.4623380 49.39509 -0.032747808
Dec 10 -9.4996690 49.39281  2.006861553
Jan 11 -9.3861026 49.39053  1.595573387
Feb 11 -9.6661427 49.36068 -2.594534174
Mar 11 -6.9393442 49.33082 -1.191480326
Apr 11 -2.7986740 49.29139  0.407285321
May 11  3.4073341 49.25195 -1.459286983
Jun 11  9.0507920 49.17506  2.174150841
Jul 11 12.8484910 49.09816 -1.846652440
Aug 11 11.6057897 48.99391  1.000302665
Sep 11  7.4204835 48.88965  0.689862592
Oct 11  0.4193803 48.82120  1.659419871
Nov 11 -6.4623380 48.75275  0.709592137
Dec 11 -9.4996690 48.68084 -0.381173068
Jan 12 -9.3861026 48.60894 -2.122835800
Feb 12 -9.6661427 48.45149 -0.385350122
Mar 12 -6.9393442 48.29405 -2.954703036
Apr 12 -2.7986740 48.17523  1.123444348
May 12  3.4073341 48.05641  2.036253780
Jun 12  9.0507920 48.15376  1.195449975
Jul 12 12.8484910 48.25110 -0.499594936
Aug 12 11.6057897 48.35883 -1.764617040
Sep 12  7.4204835 48.46655 -2.087034321
Oct 12  0.4193803 48.42958 -2.248960029
Nov 12 -6.4623380 48.39261  3.569729251
Dec 12 -9.4996690 48.38903  1.710638618
Jan 13 -9.3861026 48.38545  3.400650458
Feb 13 -9.6661427 48.50860 -0.442460756
Mar 13 -6.9393442 48.63175 -1.392410562
Apr 13 -2.7986740 48.68931 -1.290632113
May 13  3.4073341 48.74686 -1.254191616
Jun 13  9.0507920 48.74173 -0.792525441
Jul 13 12.8484910 48.73661  0.514899628
Aug 13 11.6057897 48.86357  3.030638306
Sep 13  7.4204835 48.99053 -0.111018194
Oct 13  0.4193803 49.23975 -2.359128591
Nov 13 -6.4623380 49.48896  0.573376000
Dec 13 -9.4996690 49.75846  1.541206942
Jan 14 -9.3861026 50.02796 -4.441859642
Feb 14 -9.6661427 50.28863 -1.322483063
Mar 14 -6.9393442 50.54929  0.890054924
Apr 14 -2.7986740 50.71264  0.786031382
May 14  3.4073341 50.87600 -0.083330111
Jun 14  9.0507920 50.82199  0.927214904
Jul 14 12.8484910 50.76799  1.883518814
Aug 14 11.6057897 50.55721  2.736996399
Sep 14  7.4204835 50.34644  2.333078806
Oct 14  0.4193803 50.12288 -0.342256860
Nov 14 -6.4623380 49.89932 -1.336977540
Dec 14 -9.4996690 49.72208 -4.422410657
Jan 15 -9.3861026 49.54484 -0.758741301
Feb 15 -9.6661427 49.47481 -1.608668789
Mar 15 -6.9393442 49.40478 -2.065434869
Apr 15 -2.7986740 49.52709  0.171585593
May 15  3.4073341 49.64940  0.343268104
Jun 15  9.0507920 49.91322  0.635990769
Jul 15 12.8484910 50.17704  3.474472327
Aug 15 11.6057897 50.39003 -1.595820897
Sep 15  7.4204835 50.60303  1.176490701
Oct 15  0.4193803 50.65975  0.120870312
Nov 15 -6.4623380 50.71647 -1.454135089
Dec 15 -9.4996690 50.69741  4.602263773
Jan 16 -9.3861026 50.67834 -1.292234891
Feb 16 -9.6661427 50.65289  1.613257560
Mar 16 -6.9393442 50.62743 -0.188088581
Apr 16 -2.7986740 50.52838 -0.629701859
May 16  3.4073341 50.42932 -3.836653088
Jun 16  9.0507920 50.17694  1.272271864
Jul 16 12.8484910 49.92455  1.826955710
Aug 16 11.6057897 49.63471  2.759497900
Sep 16  7.4204835 49.34487  0.034644912
Oct 16  0.4193803 49.13328 -0.952657340
Nov 16 -6.4623380 48.92168  1.740655396
Dec 16 -9.4996690 48.74207 -2.842396795
Jan 17 -9.3861026 48.56245 -1.876346513
Feb 17 -9.6661427 48.45198 -3.785832633
Mar 17 -6.9393442 48.34150  2.597842654
Apr 17 -2.7986740 48.39278 -1.694109063
May 17  3.4073341 48.44406  0.848601268
Jun 17  9.0507920 48.63807  0.911136572
Jul 17 12.8484910 48.83208 -1.680569229
Aug 17 11.6057897 49.02535  0.468864150
Sep 17  7.4204835 49.21861  1.460902352
Oct 17  0.4193803 49.32807 -0.147451108
Nov 17 -6.4623380 49.43753 -1.375189581
Dec 17 -9.4996690 49.50077  1.298903955
Jan 18 -9.3861026 49.56400  0.622099965
Feb 18 -9.6661427 49.59113  1.075012028
Mar 18 -6.9393442 49.61826 -4.278914500
Apr 18 -2.7986740 49.58471  0.613967094
May 18  3.4073341 49.55116  1.141510737
Jun 18  9.0507920 49.51165  0.037559827
Jul 18 12.8484910 49.47214 -0.920632189
Aug 18 11.6057897 49.51379  0.680421399
Sep 18  7.4204835 49.55544 -0.675920190
Oct 18  0.4193803 49.62402  0.856598477
Nov 18 -6.4623380 49.69261 -1.830267869
Dec 18 -9.4996690 49.67170 -3.072030371
Jan 19 -9.3861026 49.65079  1.835309600
Feb 19 -9.6661427 49.62754  1.238607505
Mar 19 -6.9393442 49.60428  4.635066819
Apr 19 -2.7986740 49.64724 -0.248567990
May 19  3.4073341 49.69021 -0.697540749
Jun 19  9.0507920 49.70352  0.245692683
Jul 19 12.8484910 49.71682 -2.965314991
Aug 19 11.6057897 49.65832 -0.864108619
Sep 19  7.4204835 49.59981 -0.020297425
Oct 19  0.4193803 49.59106  0.689564398
Nov 19 -6.4623380 49.58230  4.680041208
Dec 19 -9.4996690 49.60552 -0.905855929
Jan 20 -9.3861026 49.62875 -0.842650592
Feb 20 -9.6661427 49.61924  0.946898450
Mar 20 -6.9393442 49.60974 -0.270391100
Apr 20 -2.7986740 49.54164  1.057038974
May 20  3.4073341 49.47353 -0.480868903
Jun 20  9.0507920 49.38389 -0.434684834
Jul 20 12.8484910 49.29425 -1.442741870
Aug 20 11.6057897 49.20101  0.993195345
Sep 20  7.4204835 49.10778  1.671737384
Oct 20  0.4193803 49.01779 -2.737175044
Nov 20 -6.4623380 48.92781  4.134527516
Dec 20 -9.4996690 48.83933 -1.539660388

Any ideas here? It seems just calculating the mean and applying it to the seasonal is not correct?

brandtg commented 9 years ago

I think you are right with this statement: "I think this is saying that this specific function is not used in stl, but loess is still used in the actual implementation".

The reason for this is that it looks like Loess is the only smoothing used in the Fortran code, other than moving average (stlma) which is called in the low pass filter stlfs like we are doing in Java. Here's the code that I'm looking at, for reference: https://gist.github.com/brandtg/a25aecd0fbcf25ab65d1.

Another notable difference in the Java and R versions is their implementation of Loess. We're using LoessInterpolator.

The Fortran code is quite complex, so I haven't done a thorough comparison between the two yet unfortunately. However, there are a lot of configuration parameters for the Java Loess, which if manipulated might get us closer to the R version.

I think we might need to refactor and add LOESS_ROBUSTNESS_ITERATIONS to StlConfig but the other bandwith configurations are easily accessible there. By the way, what StlConfig are you running tests with?

I'm hoping that this is just a matter of finding configuration parameters (either statically or derived from input parameters) that match the default R configuration well.

hntd187 commented 9 years ago

@brandtg I agree it's probably config based, they are close enough I'd imagine it's something to that end.

I'm using the default configs from the STLRunner, just so when I cite numbers I'm not using anything whacky. I ran the R implementation with the most similar values I could, but they have different parameters available for use.

final STLConfig config = new STLConfig();
config.setNumberOfObservations(12);
config.setNumberOfInnerLoopPasses(10);
config.setNumberOfRobustnessIterations(1);
config.setSeasonalComponentBandwidth(0.75);
config.setLowPassFilterBandwidth(0.30);
config.setTrendComponentBandwidth(0.10);
config.setNumberOfDataPoints(ts.length);
final STLDecomposition stl = new STLDecomposition(config);
brandtg commented 9 years ago

Got it. Yeah, the default configs didn't really have any basis other than that the output looked to be what the paper describes in a general sense, and is not necessarily exactly the same as the R implementation.

The main difference with the Java Loess seems to be that it takes the bandwidth as a fraction of points closest to the current point, whereas in the Fortran code, ns, nt, nl are all integral spans (at least 3 and odd). I'm guessing that this can lead to slightly different behavior.

Is there a reason for having exact parity with R, other than that it is a nice sanity check? IMO if we have defaults such that the function generally behaves correctly, as well as expose all the knobs via config, it's probably okay. Even the smoothing stuff as described in the paper is a little hand-wavey.

It's probably still good to add the "2.5 Post-smoothing of the seasonal" bit of code that we're talking about here though, to be consistent with the paper.

hntd187 commented 9 years ago

Yes, I agree. I think the difference in code, architecture, implementation and all of that I think we'll run into these problems, but I agree as long as we are fitting the series well and the residuals look alright than I believe it should be fine. Okay let me tie up and check in some of my work and you can look at it. I think since my implementation and yours create the same results, I'll remove mine and use yours and implement it.

hntd187 commented 9 years ago

3 Opened a pull request for this.

brandtg commented 9 years ago

Merged pull request #3