Some quick code-to-code comparisons in Octave show the following:
%% Initialize
format long
load rhoe.dat
%% Implementation under test
a = arsel(rhoe',false);
a.AR{1}
%% ARMAsel by Broersen
[bAR, bMA] = armasel(rhoe)
%% arburg from the Octave signal package
pkg load signal
c = arburg(rhoe,6)
%% Observe how well ARMAsel and arburg agree
bAR' - c'
% ans =
% 0.00000000000000e+00
% -9.91173365605391e-10
% 2.66611577259823e-09
% -2.73298295105917e-09
% 1.46097456354966e-09
% -5.00774033351803e-10
% 9.78394379402836e-11
% Observe that we disagree with both by roughly the same magnitude
a.AR{1}' - bAR'
%ans =
% 0.00000000000000e+00
% 7.50698463392041e-08
% -2.01927136345148e-07
% 2.06991588536098e-07
% -1.10651835694142e-07
% 3.79278468076727e-08
% -7.41022140871017e-09
a.AR{1}' - c'
%ans =
% 0.00000000000000e+00
% 7.40786729735987e-08
% -1.99261020572550e-07
% 2.04258605585039e-07
% -1.09190861130593e-07
% 3.74270727743209e-08
% -7.31238197076989e-09
Conclusions:
- The agreement between ARMAsel and arburg is what I will call 'acceptable floating point error' for rhoe.dat.
- arsel is appreciably above acceptable floating point error and, presumably, buggy in some slight manner.
- arburg has freely available source in the Octave signal package that I can read easily, so debugging should proceed from "why doesn't arsel reproduce arburg on rhoe.dat?"
- Lastly, the previous comment's data was shown for arsel built with -O0, but there is no appreciable difference in its results at -O2.
I've added a verbatim copy of Collomb's original code as issue3.cpp. Running
./issue3 6 < rhoe.dat
permits comparison with
./arsel < rhoe.dat
and should ideally produce nearly identical results. They differ a digit or two earlier than I suspect they should. Moreover, the former is closer to ARMAsel and arburg.
I believe I can explain the discrepancy between issue3 and arsel based solely on the use of ar::welford_nvariance perturbing the initial value of Dk used in the latter relative to the former. That is, using
Value Dk = - f[0]*f[0] - f[N - 1]*f[N - 1]
+ 2*inner_product(f.begin(), f.end(), f.begin(), Value(0));
instead of the expression in c42b623b7379566d5c0302db2fec78f33e400cbd causes
vimdiff <(./arsel < rhoe.dat) <(./issue3 6 < rhoe.dat)
to show no difference in the numerical outputs.
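For context, a Welford-style accumulation of N times the variance looks roughly like the following sketch; the name matches ar::welford_nvariance but the signature is assumed for illustration. On mean-subtracted data its result equals the plain sum of squares in exact arithmetic yet differs by rounding in floating point, which is precisely how it perturbs the initial Dk relative to the inner_product expression above.

#include <cstddef>

// Sketch only: N times the variance, i.e. the sum of squared
// deviations from the running mean, accumulated Welford-style.
template <class InputIterator>
double welford_nvariance(InputIterator first, InputIterator last)
{
    double mean = 0, nvar = 0;
    std::size_t n = 0;
    while (first != last) {
        const double x     = *first++;
        const double delta = x - mean;
        mean += delta / ++n;          // Update the running mean
        nvar += delta * (x - mean);   // Accumulate squared deviations
    }
    return nvar;
}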
Both
- whether ar::welford_nvariance in arsel causes a stationary process to be returned on issue3.dat, and
- whether using ar::welford_ncovariance instead of std::inner_product causes a stationary process to be returned on issue3.dat
remain to be seen.
Using the newer welford_inner_product uniformly in burg_method still causes a NaN in T0 for issue3.dat. It does, however, improve the code-to-code error by a single digit.
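I have not shown welford_inner_product here. As an illustrative stand-in, the following sketch computes an inner product with Kahan-compensated accumulation, one standard way to limit the accumulation error being probed; the actual routine in ar.hpp may differ.

#include <cstddef>
#include <vector>

// Sketch only: inner product with Kahan compensation so that the
// low-order bits lost in each addition are recovered and re-applied.
double compensated_inner_product(const std::vector<double>& x,
                                 const std::vector<double>& y)
{
    double sum = 0, c = 0;                 // c carries the lost low-order bits
    for (std::size_t i = 0; i < x.size() && i < y.size(); ++i) {
        const double term = x[i]*y[i] - c;
        const double t    = sum + term;
        c   = (t - sum) - term;            // Recover this step's rounding error
        sum = t;
    }
    return sum;
}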
Using only std::inner_product pervasively within burg_method still causes a NaN in T0 for issue3.dat. When not subtracting the mean and using std::inner_product,
vimdiff <(./arsel < issue3.dat) <(./issue3 21 < issue3.dat)
matches to every last digit.
My implementation of Collomb's material no longer feels suspect. I now need to check Collomb's implementation, as embodied in issue3.cpp, to see if it returns stationary poles and to see if arburg does too.
I documented how to find process poles in 0340adbe7460e99b9fc0e22307bbd34053c1f893.
With this procedure, Collomb's implementation is indeed suspect, as arburg in Octave shows a stationary result:
octave:1> pkg load signal
octave:2> load issue3.dat
octave:3> [a,v,k] = arburg(issue3 - mean(issue3), 23);
octave:4> p = 1./roots(fliplr(a));
octave:5> abs(p)
ans =
0.923725596050735
0.925431996245389
0.925431996245389
0.925580374177076
0.925580374177076
0.926743838831280
0.926743838831280
0.928088357854714
0.928088357854714
0.933964914226998
0.933964914226998
0.981886026884414
0.981886026884414
0.982753255492121
0.982753255492121
0.983628188840103
0.983628188840103
0.958490360945454
0.958490360945454
0.985643046802406
0.985643046802406
0.996302237447710
0.996302237447710
Compare that with Collomb's approach as implemented in issue3.cpp:
octave:1> load issue3.dat;
octave:2> d = issue3 - mean(issue3);
octave:3> save -ascii d.dat d;
octave:4> system("./issue3 23 < d.dat > p.out");
octave:5> load p.out;
octave:6> abs(1./roots(flipud(p)))
ans =
1.88940538068673e-03
3.19672152141396e-03
1.01211506678184e-02
2.06485130552164e-01
1.00932745736238e+00
1.00929251980753e+00
1.00929251980753e+00
8.19759562519569e-01
1.00902417558368e+00
1.00902417558368e+00
1.00795433644856e+00
1.00795433644856e+00
1.00517166035139e+00
1.00517166035139e+00
9.99952806795258e-01
9.99952806795258e-01
9.99985083130086e-01
9.99985083130086e-01
1.22175908145864e+00
2.68920966031252e+01
1.62509900092322e+02
3.69097249179961e+02
5.54748369654496e+02
This clearly is a nonstationary result from Collomb's algorithm because there are poles outside the unit circle.
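For reference, the same pole-magnitude check sketched in C++, assuming Eigen is available (illustration only, not part of ar.hpp): the poles of the AR(p) model {1, a1, ..., ap} are the eigenvalues of the companion matrix of z^p + a1*z^(p-1) + ... + ap, and stationarity requires every one of them to lie strictly inside the unit circle.

#include <Eigen/Eigenvalues>
#include <vector>

// Sketch only: true when every pole lies strictly inside the unit circle.
bool is_stationary(const std::vector<double>& a)  // a = {1, a1, ..., ap}
{
    const int p = static_cast<int>(a.size()) - 1;
    Eigen::MatrixXd C = Eigen::MatrixXd::Zero(p, p);
    for (int j = 0; j < p; ++j) C(0, j) = -a[j + 1];  // Top row: -a1, ..., -ap
    C.bottomLeftCorner(p - 1, p - 1).setIdentity();   // Ones on the subdiagonal
    const Eigen::EigenSolver<Eigen::MatrixXd> es(C, false);
    return (es.eigenvalues().array().abs() < 1.0).all();
}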
Phew. This bug has its origins in a bad algorithm writeup by Collomb and not in some subtle bit of my implementation, I think. I need to determine what it is about Collomb's algorithm that differs from arburg and to incorporate that into ar.hpp.
I have attempted to contact Cedrick Collomb. Will update when he gets back to me if not before.
Using a hacked-up (but numerically identical) version of Octave's arburg I obtain the following result:
octave:2> load ~/Build/ar/issue3.dat
octave:3> format long
octave:4> format compact
octave:5> [a,v,k]=archop(issue3,6)
v = 645.914610708842
last_k = -0.999971419191272 v = 0.0369209962631440
last_k = 0.999451228373179 v = 4.05112715972809e-05
last_k = -0.999177921437641 v = 6.65795177774512e-08
last_k = 0.997901260018939 v = 2.79172928323599e-10
last_k = -0.997659483833322 v = 1.30528819033259e-12
last_k = 0.995472493017744 v = 1.17926465768418e-14
a' =
1.000000000000000
-5.983812819711469
14.930773410432128
-19.884897788526281
14.908199140831057
-5.965734435604825
0.995472493017744
v = 1.17926465768418e-14
k =
-0.999971419191272
0.999451228373179
-0.999177921437641
0.997901260018939
-0.997659483833322
0.995472493017744
This will serve as the gold solution for debugging whatever's up with Collomb's implementation.
Hardcoding the last_k values from arburg within issue3.cpp reproduces arburg's results. The problem is the computation of the denominator of last_k, which corresponds to the update of Dk in Collomb's presentation.
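Concretely, in Collomb's notation (forward and backward errors f and b, iteration index k, reflection coefficient mu), the two candidate denominator computations look as follows. This is a sketch rather than the ar.hpp source:

#include <cstddef>
#include <vector>

// Recursive O(1) update per [Andersen1978], as used by Collomb: cheap,
// but rounding errors accumulate across iterations.
double recursive_Dk(double Dk, double mu, const std::vector<double>& f,
                    const std::vector<double>& b, std::size_t k)
{
    const std::size_t N = f.size();
    return (1.0 - mu*mu)*Dk - f[k+1]*f[k+1] - b[N-k-2]*b[N-k-2];
}

// Direct O(N) recomputation from the defining equation: each
// iteration's denominator is built afresh from the current errors.
double direct_Dk(const std::vector<double>& f,
                 const std::vector<double>& b, std::size_t k)
{
    const std::size_t N = f.size();
    double Dk = 0;
    for (std::size_t n = k + 1; n < N; ++n)
        Dk += f[n]*f[n] + b[n-k-1]*b[n-k-1];
    return Dk;
}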
It turns out that Collomb's implementation, now renamed from issue3.cpp to collomb2009.cpp, is nothing but the recursive-denominator algorithmic variant due to [Andersen1978], given as C code by [Faber1986] and coded up in the new faber1986.cpp.
collomb2009.cpp and faber1986 agree to machine precision on rhoe.dat for p=6 but diverge wildly when run against issue3.dat. Bingo: algorithmic variant instability demonstrated using sample code published in Proc. IEEE.
Found the smoking gun. On page 1582, [Andersen1978] says
However, for extremely low-noise signals, accumulation of round-off errors may occasionally become important in the numerator, causing the computed value of am to become slightly larger than unity, though this is not possible algebraically, cf. (1). The problem may be avoided in these cases by returning to the defining equation (1)...
[Andersen1978]'s motivation was avoiding some FLOPs during computation of the reflection coefficient. A reasonable implementation of the vanilla, non-recursive-denominator algorithm should be able to continue working with a single pass through memory and obtain good performance on modern systems even without this unstable optimization.
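To make that concrete, here is a minimal self-contained sketch of such a vanilla implementation; the names and conventions are mine for illustration, and this is not the burg_method from ar.hpp. The numerator and denominator of the reflection coefficient come out of the same single pass through f and b, so abandoning the recursive denominator costs extra FLOPs but no extra trips through memory.

#include <cstddef>
#include <vector>

// Sketch only: fit an AR(m) model to zero-mean data x by Burg's
// method, returning {1, a1, ..., am}.  The denominator is recomputed
// from the defining equation at every order.
std::vector<double> burg(const std::vector<double>& x, std::size_t m)
{
    const std::size_t N = x.size();
    std::vector<double> f(x), b(x), a(1, 1.0);
    for (std::size_t k = 0; k < m; ++k) {
        double num = 0, den = 0;          // Both sums from one pass
        for (std::size_t n = k + 1; n < N; ++n) {
            num += f[n]*b[n-k-1];
            den += f[n]*f[n] + b[n-k-1]*b[n-k-1];
        }
        const double mu = -2*num/den;     // Reflection coefficient
        a.push_back(0.0);                 // Levinson recursion on a
        const std::vector<double> old(a);
        for (std::size_t i = 1; i <= k + 1; ++i)
            a[i] = old[i] + mu*old[k+1-i];
        for (std::size_t n = k + 1; n < N; ++n) {  // Update the errors
            const double fn = f[n];
            f[n]     += mu*b[n-k-1];
            b[n-k-1] += mu*fn;
        }
    }
    return a;
}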
To close this whole mess out, I plan to return to the defining equation within ar.hpp. I will then file a new issue capturing that:
- an ar::welford_inner_product-like routine computing a compensated reflection coefficient should be investigated, to be sure accumulation errors on long signals aren't killing us, and
- the results for issue3.dat should be checked in terms of the complex pole locations.
c3fd62067172d120280f3f6a0bcec0b493da1fa5 temporarily turns off the compensated reflection coefficient computation, and it nails the Octave arburg results for test{1,2,3}.dat to machine precision.
08615f9e91e0777600c9a7449df9c83698b44158 closes the ticket.
Nick has found an input data set, saved as file issue3.dat, which results in a nonstationary AR(p) process when fed into arsel. The nonstationarity manifests itself as a NaN in T0. Investigating the AR(p) process poles shows that some are larger than one in absolute value. Algorithmically, I do not believe this should be possible with Burg's method, and hence something is amiss.