Signal reconstruction outside the data range on which the nfft was calculated

a-z-e-r-i-l-a commented 4 years ago

Hi,

I did an experiment with the output of the nfft which I was not sure how to interpret it. First I calculate the nfft_adjoint and then based on the output try to reconstruct the result which works.

N = 100000
K = -N / 2 + np.arange(N)
fk_hat = nfft.nfft_adjoint(z, r, N)

fk = 0
for i, k in enumerate(K):
    fk += fk_hat[i]*np.exp(-2*np.pi*1j*k*z)

plt.plot(z, fk/(N), '-', color='red', label='nfft')
plt.plot(z, r, '.', color='blue', label='data')
plt.legend()

Then I augment the input z to z3 and calculate the reconstruction on it:


z1 = np.linspace(-700,-422,20)
z2 = np.linspace(424,700, 20)
z3 = np.concatenate((z1, z, z2))

N = 100000
K = -N / 2 + np.arange(N)
fk_hat = nfft.nfft_adjoint(z, r, N)

fk = 0
for i, k in enumerate(K):
    fk += fk_hat[i]*np.exp(-2*np.pi*1j*k*z3)

plt.plot(z3, fk/(N), '-', color='red', label='nfft')
plt.plot(z, r, '.', color='blue', label='data')
plt.legend()

It is also the same output if I use the original forward fft.

N= 100000
k = -N / 2 + np.arange(N)
r_hat = nfft.nfft_adjoint(z, r, N)

r_reconstructed = nfft.nfft(z3, r_hat) / N
plt.plot(z, r, '.', color='blue')
plt.plot(z3, r_reconstructed, '-', color='red')

My expectation was to see also the same wave continue osculating outside the data range the adjoing nfft was calculated on it.

Isn't the purpose of nfft finding the true frequency components of a signal similar to fft, or have I been using it incorrectly.

my data is in addition as follows:

r = np.array([119.75024144, 119.77177673, 119.79671626, 119.81566188,
       119.81291201, 119.71610143, 119.24156708, 117.66932347,
       114.22145178, 109.27266933, 104.57675147, 101.63381325,
       100.42623807, 100.09436745, 100.02798438, 100.02696846,
       100.05422613, 100.12216521, 100.27569606, 100.60962812,
       101.32023289, 102.71102637, 105.01826819, 108.17052642,
       111.67848758, 114.78442424, 116.95337537, 118.19437002,
       118.84307457, 119.19571404, 119.40326818, 119.53101551,
       119.61170874, 119.66610072, 119.68315253, 119.53757829,
       118.83748609, 116.90425868, 113.32095843, 108.72465638,
       104.58292906, 101.93316248, 100.68856962, 100.22523098,
       100.08558767, 100.07194691, 100.11193397, 100.19142891,
       100.33208922, 100.5849306 , 101.04224415, 101.87565882,
       103.33985519, 105.63631456, 108.64972952, 111.86837667,
       114.67115037, 116.69548163, 117.96207449, 118.69589499,
       119.11781077, 119.36770681, 119.51566311, 119.59301667])

z = np.array ([-422.05230434, -408.98182253, -395.78387843, -382.43143962,
       -368.92341485, -355.26851343, -341.47780372, -327.56493425,
       -313.54536462, -299.43740189, -285.26768576, -271.07676026,
       -256.92098157, -242.86416227, -228.95449427, -215.207069  ,
       -201.61590575, -188.17719265, -174.89201262, -161.75452196,
       -148.74812279, -135.85126854, -123.04093538, -110.29151714,
        -97.57502515,  -84.86119278,  -72.1145478 ,  -59.2947726 ,
        -46.36450604,  -33.29821629,  -20.08471733,   -6.72030326,
          6.80047849,   20.48309726,   34.32320864,   48.30267819,
         62.393214  ,   76.56022602,   90.76260159,  104.94787451,
        119.04731699,  132.98616969,  146.71491239,  160.23436159,
        173.58582543,  186.81849059,  199.96724955,  213.05229133,
        226.08870416,  239.09310452,  252.08377421,  265.0769367 ,
        278.08234368,  291.10215472,  304.13509998,  317.18351924,
        330.25976991,  343.38777732,  356.59626164,  369.90725571,
        383.33109354,  396.87227086,  410.5309987 ,  424.28994387])

Thanks.

jakevdp commented 4 years ago

The purpose of the NFFT is to solve the two equations listed here in O[N log M] time rather than the naive O[N M] time.

I suspect you can probably map what you want to do onto these two equations, with a bit of effort. At that point, you could use the NFFT to do it faster.

jakevdp commented 4 years ago

Thought about this a bit - the real issue is that the NFFT constructs a Fourier basis based on a fundamental frequency that is related to the extent of the data, and in your case there is no guarantee that this fundamental frequency matches the fundamental frequency of your model. So if you want to do this kind of extrapolation, you'll first have to determine the fundamental frequency on which to construct your Fourier basis.

One way to do this is via a least-squares fitting of a likelihood function minimized over frequency; this can be expressed in terms of the nfft, but simply putting your data into an NFFT won't be much help. One possible approach for this is a Lomb-Scargle periodogram, which is implemented in astropy: https://docs.astropy.org/en/stable/timeseries/lombscargle.html (it's implemented using something very similar to an NFFT, but provides the high-level interface you're looking for).

So to construct your inference model, I would start with that, find the best-fit frequency (subject to the caveats listed there) and then use the model() method to build your frequency model.

So why isn't this as easy as just computing the FFT as in the uniformly-sampled case? Well, for uniformly sampled data, you can only detect frequencies that are an integer multiple of the data baseline, for reasons you can read about in any FFT theory handbook. For unevenly spaced data, your data can constrain virtually any frequency, so you have to be more careful.

Hope that helps.

a-z-e-r-i-l-a commented 4 years ago

@jakevdp Yes, that completely worked for what I was looking for! Thank you!!

jakevdp / nfft

Signal reconstruction outside the data range on which the nfft was calculated #15