MHKiT-Software / MHKiT-Python

MHKiT-Python provides the marine renewable energy (MRE) community tools for data processing, visualization, quality control, resource assessment, and device performance.
https://mhkit-software.github.io/MHKiT/
BSD 3-Clause "New" or "Revised" License
47 stars 45 forks source link

Improve execution time for surface elevation computation #229

Closed mbruggs closed 10 months ago

mbruggs commented 1 year ago

This is primarily placeholder, I'm happy to contribute this improvement when I get a moment. The code below needs generalising to properly use the Dataframe that is passed in.

Describe the bug:

The computation of the surface elevation from a spectrum uses a 'sum of sines' approach which is prohibitively slow. During some experiments, trying calculate a 3hr sea state at 20Hz resolution crashed the python process on my machine, a 1hr sea state took 11s.

Alternative approach

Alternatively, we can use an approach using an ifft. For the same calculation that previously took 11s, the ifft version took 0.007s.

The code below (loosely adapted to match current MHKit structure) gave the same results as the previous surface_elevation method using the sum of sines.

def surface_elevation(S, time_index, seed=None, frequency_bins=None, phases=None)
    duration_s = time_index[-1]
    dt = time_index[1] - time_index[0]

    # see notes section in
    # https://numpy.org/doc/stable/reference/generated/numpy.fft.irfft.html
    if time.size % 2 != 0:
        time = np.append(time, [duration_s + dt])

    f = S.index
    df = f[1] - f[0]
    if phases is None:
        np.random.seed(seed)
        phases = 2 * np.pi * np.random.rand(f.size)

    # Convert from energy to amplitude spectrum
    amp = np.sqrt(2 * S.values.squeeze() * df)
    amp[0] = 0  # 0 frequency gives NaN value in spectrum

    amp_re = amp * np.cos(phases)
    amp_im = amp * np.sin(phases)
    amp_complex = amp_re + 1j * amp_im

    eta = np.fft.irfft(0.5 * amp_complex * time.size, time.size)
ssolson commented 1 year ago

That is an incredible time improvement and we should use this assuming there are no other tradeoffs. @cmichelenstrofer could you comment on @mbruggs proposed approach?

cmichelenstrofer commented 1 year ago

@ssolson The IFFT is the better approach for this. The only limitation is that the frequencies have to be evenly spaced (but I believe the current function enforces this anyways). In some applications (rare in practice tbh) you might want uneven spacing, such as the "equal energy binning". So maybe there is still an advantage to keeping a separate function that uses the sum of sines? @ryancoe thoughts?

ryancoe commented 1 year ago

I'd say it'd be relatively easy to check that frequency spacing and then select either the ifft or summation of sinusoids depending on whether the spacing is equal.