Frequency subsampling - Githubissues

ryancoe commented 4 years ago

The number of frequency components used in an analysis has a strong effect on the computational load to evaluate a design for (at least) two reasons:

NEMOH solver time scales linearly with number of frequencies
For solution of the pseudo spectral method (quadratic programming)

Thus, it is worthwhile to minimize the number of frequency components considered. This can be done based on the input spectrum (i.e., the sea state). The basic idea is to:

Look for relevant frequency range based on sea state
Subsample this range (i.e., select every 5th component)
Extend the frequencies up to ~3 harmonics of the highest frequency (this will only matter for extreme loads and things of this sort, not for average power)
Use these frequencies when running NEMOH
Use these frequencies when solving the PS problem

zmorrell-sand commented 4 years ago

@ryancoe I went ahead and did a linear interpolation using every 5th component to get a quick approximation of what you described. I obtained these values from a linear interpolation of original bretschneider spectrum. Attached are the graphs. The graph on the left is the original, and the graph on the right is the every 5th component interpolation. It looks a little bit coarse, but that can be fixed by doing a finer subsampling. It should also be easy enough to generalize to more sea states.

The other easier/better option, now that I think about it, we can specify the 'w' as an array input to the bretschneider function, such as bretschneider(linspace(0,3.2, 65), [8,10],0)

First method (interpolating after generating):

Second (probably better method):

ryancoe commented 4 years ago

@zmorrell-sand - I also thought about the second approach, but I think the first may prove to be more general. If we don't know what the spectrum looks like a priori, then we can't confidently set the frequency argument in the bretschneider function. Additionally, we will eventually want to work with empirical spectra, which might look something like this (the bretschneider is a smooth parametric form than approximates a real ocean spectrum):

1-s2 0-S1571995203800235-gr11

Thus, kudos for your investigating here, but I think we should go with the first approach.

zmorrell-sand commented 4 years ago

@ryancoe Ah, I hadn't considered other seastates. In that case, I agree that the first option is more general. I can write up a function that we put after the seastate call to do the interpolation. Regarding the three harmonics, what do you think would be a good way to do this. I am thinking of something like the following pseudocode:

S = bretchneider(...)
lastS = S.S(-1) //last element in the list
lastw = S.w(-1) //last element in the list

harmonicw = 3*lastw
harmonicS = 0

interpolate(lastW, lastS, harmonicw, harmonicS);

obviously, that will not run as written, but that is the basic thought process. Would that accomplish what you are wanting?

ryancoe commented 4 years ago

We can write a stand along function like this:

function New = subSampleFreqs(S)
ind_sp1 = find(S.S > 0.01*max(S.S),1,'first');
ind_sp2 = find(S.S > 0.01*max(S.S),1,'first');
ind_sp = ind_sp1:ind_sp2*3; % for the harmonics
Snew = interp...

Pass the frequency vector into NEMOH via [hydro,rundir] = RM3_getNemoh(geomMode,varargin{:});
The interp1s in lines 153:178 probably go away.

zmorrell-sand commented 4 years ago

@ryancoe I have a basic version made, but it doesn't quite work yet on the 0 side. It is getting late, so I am going to call it a day. I will commit the working version on Wednesday, but here is what I have so far.

function [new_w, new_S] = subSampleFreqs(S, npoints)
%subSampleFreq - subsamples sea state and interpolates to three harmonics
%    gets a subsampling of a given seastate by linear interpolation
%    Inputs:
%        S = seastate.  must have S.S and S.w
%        npoints = number of points to subsample
%    Outputs:
%        newS = new density values
%        neww = new frequency values

if(nargin < 2)
    npoints = 120;
end
ind_sp1 = find(S.S > 0.01 * max(S.S),1,'first');
ind_sp2 = find(S.S > 0.01 * max(S.S),1,'last');
new_w = linspace(S.w(ind_sp1), S.w(ind_sp2) * 3, npoints);
new_S = interp1(S.w, S.S, new_w);
new_S(isnan(new_S)) = 0;
end

zmorrell-sand commented 4 years ago

I ran the RM3_unitTest.m file with the subsampling thrown in, and all tests passed. Adding the subsampling cut the runtime of the tests to about a third of the original time, so it seems to be working as hoped. I will push my changes.

zmorrell-sand commented 4 years ago

fb7c8ee794fdf1e9318b1bc01f30452c8a75b502

ryancoe commented 4 years ago

@ssolson - This is the previous discussion of the problem about sub-sampling and interpolating the wave spectrum/BEM data. From my quick read of the above comments, I can't understand why the current code is not more logical/efficient. It sounds like you understand the problem well, but for documentation, the key principles are:

Based on some a priori understanding of the problem, it should be possible to avoid interpolation (or at least do it infrequently). The sea states are of a finite frequency range, and we can use this to specify the frequency values at which to run NEMOH.
However, we must be careful for (and therefore may want both unit tests and run-time checks):
- Having spectral and Nemoh data at super harmonics (see above: https://github.com/SNL-WaterPower/WecOptTool/issues/29#issue-579131577)
- Having sufficient resolution for the spectrum. For a synthetic spectrum, this is easy because the spectrum is smooth. However, an empirical spectrum my be more peaky.
- Having sufficient spectral resolution for the WEC (Nemoh). This will depend on the size of the WEC.

ryancoe commented 4 years ago

To summarize our previous discussion, the revised/refined workflow should be something like:

Create spectra
Find frequency range for all spectra in study
Based on a default frequency spacing, which can be set by the user alternatively, create a finite set of frequencies over the frequency range.
Check, via RMSE and moments, that the new frequency vector accurately represents the spectra
Use this frequency vector when calling Nemoh (make sure to augment for super harmonics)
As you loop through the spectra to evaluate performance, it will likely be beneficial to interpolate or take a subset of the frequency points from the BEM when considering each spectrum.

ssolson commented 4 years ago

I have a working version of my implementation of this on my subSampleW branch.

To accomplish all the things laid out in this issue it will still need development. I am hoping that @ryancoe & @H0R5E may be able to pull and run the example.m from my branch to look at what I am currently doing. Lines 36-42 show my implementation. There is a TEMPORARY HARDCODED plot true or false in the removeSpectraTails and downSampleSpectra functions (setting 1 to true at a time shows each function results visually pretty well).

I have made some implementation decisions which adapting from the original list Ryan Made above would look like this:

Load spectra (example 8 spectra)
Pre-Process Data: Iterate over each sea-state. Remove the tails.
Pre-Process Data: Iterate over each sea-state. Down-Sample the frequency bins constrained by a maximum error and the minimum number of bins. a. Check, via MAPE , that the new frequency vector accurately represents the spectra b. TODO Check, via moments, that the new frequency vector accurately represents the spectra
TODO: make sure to augment for super harmonics

My interest in the discussion for tomorrow meeting would focus on:

Do you guys also see this as a pre-processing spectra tool?
Should I focus on implementing the current functionality or adding more of the steps laid out here. If add more steps which do I need to focus on?

test_results.pdf

Some other things from Ryan's steps (above) I need to address: How to check via moments? How super harmonics?

Based on a default frequency spacing, which can be set by the user alternatively, create a finite set of frequencies over the frequency range.
As you loop through the spectra to evaluate performance, it will likely be beneficial to interpolate or take a subset of the frequency points from the BEM when considering each spectrum.

ryancoe commented 4 years ago

From out discussion on May 7th

set dw
Find the range where there is meaningful energy in the spectra: [f_min, f_max] - see image below from @gbacelli (swap f for w) - note that what @ssolson is currently doing (looking for a percentage of the max(S)) is also OK, but the approach shown in the drawing below is more robust
round f_min and f_max to get integer multiples of dw
To handle super harmonics, augment the array to m = 6 times higher: f = f_min:dw:f_max*m
return new spectrum and qualification(s) of error

New Note

Future improvements:

Add all spectra to make a super spectrum and operate on that
Operate on abs(Fe(w)).^2 (will need to call BEM with a coarse w vector to do this and then call BEM again)
Set freq. range based on approach in drawing above, not on percentage of max(S)
(don't need super harmonics when you're not doing PS...)

H0R5E commented 4 years ago

Having had a quick look at this, I am still concerned about allowing different sea states to be simulated against the same NEMOH calculation. Even with the same dw, two sea states A and B can have different f_min and f_max and therefore NEMOH would be run with different frequencies depending on which was used. Therefore, I would expect the results for simulating performance with sea state A, directly, to be different to the results for simulating sea state A, after having designed the device (i.e called NEMOH) using seastate B. This gives us inconsistent results.

I think there is only two ways of addressing this and that's to use a fixed set of frequencies for NEMOH (at a given discretisation) or to run NEMOH for every new sea state simulated.

What do you guys think?

H0R5E commented 4 years ago

A third option is to only allow one sea state to be simulated.

ryancoe commented 4 years ago

Let's talk through this on our call tomorrow (@ssolson may be on vacation). My initial response is that is was my intention for [f_min, f_max] to be set based on all the spectra, not just one (am I forgetting something?).

H0R5E commented 4 years ago

Yes, it is set based on all the spectra in a single sea state (say S1), but if you've set up your device like:

geomMode.type = 'parametric';
geomMode.params ={5 7.5 1.125 42 S1 0.5};
cntrlMode.type = 'CC';
device = blueprint.makeDevices(geomMode, cntrlMode);

then you can use the device to simulate another sea state (S2) as follows:

device.simulate(S2);

What I'm saying is that, say, if the device were built with a different sea state (S3), i.e.:

geomMode.type = 'parametric';
geomMode.params ={5 7.5 1.125 42 S3 0.5};
cntrlMode.type = 'CC';
device = blueprint.makeDevices(geomMode, cntrlMode);

then device.simulate(S2); won't give you the same answer because the f_min and f_max of S1 and S3 will probably differ.

ryancoe commented 4 years ago

Ahh... understood. One thing that might have thrown me is what looks like a mismatch in language. I would define:

Sea state (same as spectrum): a single spectral density curve S = f(w) with units [m^2 s / rad]
Sea state array or spectra array (could also call this something like a resource representation...): multiple spectra, with some sort of weighting mu

Anyhow, I see the problem now. Two thoughts:

We can make a plot each time you call device.simulate(S) that somehow shows the interpolation of the spectrum onto the BEM frequencies. You can turn off this plot through an optional argument.
I suppose it would inexpensive to also call some check at the beginning of the simulate method that makes sure you're not making a mistake with this (e.g., compare MAPE for the input spectrum with that interpolated onto the BEM frequencies)
(don't really like this idea) Make a simulator class, instantiated with a sea state and device, that has a method runSimulation()

I strongly prefer something along the lines of 1 and/or 2, following the ethos that I would rather keep the code simpler and accept some additional risk that someone who isn't paying attention makes a mistake.

H0R5E commented 4 years ago

Yeah, sorry, I was playing fast and loose with "sea state" there, in the code they are SeaState object arrays, so it's consistent with your definition.

I don't like the idea of producing a plot by default as it makes a mess of loops. Option 2 would effectively restrict you to using sea state arrays with similar f_min, f_max, dw, right? This is probably the most "correct" way to do it, IMO.

On a related note, there is some discussion to be had about how much control the user should have with the settings afforded by this PR. I need to move some stuff around first, but I will come back with the question of what, if anything, we should automatically apply and whether we fix settings at a certain point. I think there are a few alternatives available.

H0R5E commented 4 years ago

@ryancoe, I need a little help. Looking at the getDynamicModel function of RM3.m, it seems like the model requires the frequencies to be equally spaced, e.g.

    % Calculate w step-size
    if length(iSpec) == 1
        dw = wStep;    
    else    
        dw = mean(diff(S.w))*iSkip;   
    end

    % Get column vector S at same indicies as w (Removed interpolation). 
    s = S.S(iStart:iSkip:iEnd);
    % TODO: is interp needed? %s = interp1(S.w(:), S.S, w,'linear',0);
    % Calculate wave amplitude
    waveAmp = sqrt(2 * dw * s);

I was wondering if we should enforce this? Either by throwing an error or automatically resampling in the SeaState class if a non equally spaced spectrum is given, or by throwing an error in getDynamicModel in such circumstances. Do you have any thoughts on this?

H0R5E commented 4 years ago

One other thing, can you explain to me why we need to use "integer multiples of dw" for frequencies? I found that it's a little problematic for the resampling error, because you don't get a perfect match when you use the same dw as the original spectrum (because the starting frequency has changed). Is it for NEMOH?

ssolson commented 4 years ago

Mat the Sea-states PR I am doing will replace this code you have shown. I never deleted it from the PR but that is the final intention. So I believe it should be enforced through that resampling and adding a check somewhere when the code is run?

ryancoe commented 4 years ago

Why do the frequencies need to be evenly spaced?: Good question. The methods we're using (PS and the frequency domain calculations for P and CC) assume that the frequencies have a fixed dw and that dw is the fundamental (first element); in #140, I actually did start explicitly requiring this and I think we should go ahead and do this everywhere My preference would be to throw an error instead of fixing the frequency vector for the user

https://github.com/SNL-WaterPower/WecOptTool/blob/5bd919e87de07647ffa01e48b7402ff8bc6e8566/toolbox/%2BWecOptLib/%2Bmodels/%40WaveBot/WaveBot.m#L87

@gbacelli - Can you add anything?

ryancoe commented 4 years ago

See here for how we did comparison of spectra in MHKiT

https://github.com/MHKiT-Software/MHKiT-Python/blob/cb3584c1d7fe37bf449b2b4fdb5a3cba1337b15d/mhkit/tests/test_wave.py#L124

H0R5E commented 4 years ago

@ryan, following our call, I was wondering whether comparing the (numerical) integrals of the original and modified might be useful for determining the error of the resampling (i.e. using the error in the energies)? It might be easy to go from there and use it for the tail extraction as well. What do you think?

ryancoe commented 4 years ago

Yes, good idea - this would be the zeroth moment.

H0R5E commented 4 years ago

I probably knew that once. ;)

H0R5E commented 4 years ago

OK, I gave this a try, but I still think that using the absolute density error generates a better result for a similar number of steps. See the plots for a comparison:

specific_energy_error mean_density_error

Because it's absolute, performance does seem to deteriorate with lower energy spectra:

mean_density_error_lowS

If you normalise by the maximum density then you get a much more consistent result, so I think this is probably the best way to go:

mean_density_normal_highS mean_density_normal_lowS

EDIT: Those last 2 plots should say 1%, not 0.01%.

gbacelli commented 4 years ago

Why do the frequencies need to be evenly spaced?: Good question. The methods we're using (PS and the frequency domain calculations for P and CC) assume that the frequencies have a fixed dw and that dw is the fundamental (first element); in #140, I actually did start explicitly requiring this and I think we should go ahead and do this everywhere My preference would be to throw an error instead of fixing the frequency vector for the user

https://github.com/SNL-WaterPower/WecOptTool/blob/5bd919e87de07647ffa01e48b7402ff8bc6e8566/toolbox/%2BWecOptLib/%2Bmodels/%40WaveBot/WaveBot.m#L87

@gbacelli - Can you add anything?

Equal frequency spacing gives you orthogonality of each component, which is a good thing for numerical calculations. Essentially, if dw si not constant you can't apply Fourier (e.g. FFT, Fourier series etc...)

ryancoe commented 4 years ago

Per phone call, all elements of w must be integer multiples of dw. You can still have, e.g., w = (21:41) * dw

gbacelli commented 4 years ago

The WecOptTool considers absorbed power as a variable in the objective function. Thus, for the interpolation I think we should make sure that the interpolated spectra contain the same amount of average power.

The average power contained in the spectra from WAFO S(w) is:

The power carried by each frequency component i is :

which should equal the amount of power in the frequency range (frequency bin) of width dw and centered at the frequency w_i, which, in turns is equal to:

So, at the end, the amplitudes A_i, for the spectra at the reduced frequency resolution, should be calculated as

gbacelli commented 4 years ago

In Matlab, I would implement it this way:

w = 0:0.001:10;
S = jonswap(w, [1, 10]);
f = @(x) interp1(w, S.S, x);

dw = 0.033;
wmin = 0.5;
wmax = 2*pi;

w_low = (round(wmin/dw)*dw):dw:(round(wmax/dw)*dw); % low resolution frequency vector
w_low_l = w_low - dw/2;
w_low_h = w_low + dw/2;

A = zeros(length(w_low),1);

tic
for ii = 1:length(w_low)
    A(ii) = 2*sqrt( integral(f, w_low_l(ii), w_low_h(ii)));
end
toc

H0R5E commented 4 years ago

Per phone call, all elements of w must be integer multiples of dw. You can still have, e.g., w = (21:41) * dw

OK, thanks for the clarification. We should check that the inputs to SeaState also conform.

H0R5E commented 4 years ago

The WecOptTool considers absorbed power as a variable in the objective function. Thus, for the interpolation I think we should make sure that the interpolated spectra contain the same amount of average power.

I think the problem is that ensuring equal power in the spectrum is not a guarantee of equal power from the device. This approach will "smudge" the power across a wider band of frequencies which may then behave differently in terms of device power, depending on the frequency response of the device.

ryancoe commented 4 years ago

I think the problem is that ensuring equal power in the spectrum is not a guarantee of equal power from the device. This approach will "smudge" the power across a wider band of frequencies which may then behave differently in terms of device power, depending on the frequency response of the device.

True. But this is always true for an subsampling that we do. If you have a peaky wave spectrum (e.g., measured from buoy data) or a device with a peaky response, you run this risk. The good news is that most good WEC designs do no have peaky responses and that idealized spectra (e.g., JONSWAP, Bretschneider, etc.) are not very jagged - even jagged measured spectra don't really contain that much energy in the spikes.

H0R5E commented 4 years ago

OK, well whatever you think is best. Perhaps @ssolson https://github.com/ssolson could take this back on?

Dr. Mathew Topper Computational Science & Marine Energy Consultant

Data Only Greater 7 Parklands Lodge Maynooth Co. Kildare W23 E8C6 Ireland

Email: mathew.topper@dataonlygreater.com Skype: doctor.topper Phone: +353 (0)85 166 3178 Web: www.dataonlygreater.com

On Tue, 7 Jul 2020 at 17:38, Ryan Coe notifications@github.com wrote:

I think the problem is that ensuring equal power in the spectrum is not a guarantee of equal power from the device. This approach will "smudge" the power across a wider band of frequencies which may then behave differently in terms of device power, depending on the frequency response of the device.

True. But this is always true for an subsampling that we do. If you have a peaky wave spectrum (e.g., measured from buoy data) or a device with a peaky response, you run this risk. The good news is that most good WEC designs do no have peaky responses and that idealized spectra (e.g., JONSWAP, Bretschneider, etc.) are not very jagged - even jagged measured spectra don't really contain that much energy in the spikes.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/SNL-WaterPower/WecOptTool/issues/29#issuecomment-654984068, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABG2KF5A6UTZLFT2GOU6SVTR2NFSBANCNFSM4LFRTWJA .

ssolson commented 4 years ago

I will. I suggest getting the current PR through as it near completion and a step in the right direction. Then opening a much smaller PR to address this specific enhancement.

gbacelli commented 4 years ago

I think the problem is that ensuring equal power in the spectrum is not a guarantee of equal power from the device. This approach will "smudge" the power across a wider band of frequencies which may then behave differently in terms of device power, depending on the frequency response of the device.

True. But this is always true for an subsampling that we do. If you have a peaky wave spectrum (e.g., measured from buoy data) or a device with a peaky response, you run this risk. The good news is that most good WEC designs do no have peaky responses and that idealized spectra (e.g., JONSWAP, Bretschneider, etc.) are not very jagged - even jagged measured spectra don't really contain that much energy in the spikes.

Exactly. I would also add that for the case of nonlinear effects we are already averaging over multiple phase realizations.

We can run some comparisons simulations and include it in the documentation, and leave it to the user to choose best frequency spacing.

ssolson commented 4 years ago

Reopening as gbacelli's resampling is not completed in #143

ssolson commented 3 years ago

Closing until interest in WecOptTool advanced frequency subsampling is renewed.

SNL-WaterPower / WecOptTool-MATLAB

Frequency subsampling #29