BUG: xarray `MergeError` in `wec.post_process` for certain base frequencies

michaelcdevin commented 1 year ago

Describe the bug For some values of f1, wec.post_process will fail on the merge step in this line with the following exception:

xarray.core.merge.MergeError: conflicting values for variable 'freq' on objects to be combined. You can skip this check by specifying compat='override'.

To Reproduce Run the below script using Python:

Python code

```python import numpy as np import capytaine as cpy import wecopttool as wot wot.set_loglevel("INFO") ## FloatingBody wb = wot.geom.WaveBot() mesh_size_factor = 0.5 mesh = wb.mesh(mesh_size_factor) fb = cpy.FloatingBody.from_meshio(mesh, name="WaveBot") fb.add_translation_dof(name="Heave") ndof = fb.nb_dofs stiffness = wot.hydrostatics.stiffness_matrix(fb).values mass = wot.hydrostatics.inertia_matrix(fb).values ## Wave parameters amplitude = 0.5 period = 9.5 wavefreq = 1 / period f1 = wavefreq / 5 nfreq = 5 waves_regular = wot.waves.regular_wave(f1, nfreq, wavefreq, amplitude) ## PTO force/constraint kinematics = np.eye(ndof) controller = None loss = None pto_impedance = None pto = wot.pto.PTO(ndof, kinematics) f_add = {'PTO': pto.force_on_wec} ## Create WEC object freq = wot.frequency(f1, nfreq, False) bem_data = wot.run_bem(fb, freq) wec = wot.WEC.from_bem(bem_data, inertia_matrix=mass, hydrostatic_stiffness=stiffness, f_add=f_add, ) ## Solve obj_fun = pto.mechanical_average_power nsubsteps = 5 nstate_opt = 2 * nfreq results = wec.solve( waves_regular, obj_fun, nstate_opt, scale_x_wec=1e1, scale_x_opt=1e-2, scale_obj=1e-2, ) ## Post-process wec_fdom, wec_tdom = wec.post_process(results, waves_regular, nsubsteps=nsubsteps)` ```

Expected behavior wec.post_process runs without issue.

Observed behavior An xarray.MergeError exception is thrown as shown above.

System:

OS: Windows 11
Python version: 3.10.12
WecOptTool version: 2.6.0
Any other relevant package version: capytaine 2.0, wavespectra 3.14.0

Additional information: Debugging this a bit, it looks like there are some floating point rounding errors between the frequency vectors in the Datasets of the waves and the WEC. Debugging the attached script:

> waves_regular.freq.values
# array([0.02105263, 0.04210526, 0.06315789, 0.08421053, 0.10526316])

> wec.frequency[1:]
# array([0.02105263, 0.04210526, 0.06315789, 0.08421053, 0.10526316])

> waves_regular.freq.values == wec.frequency[1:]
# array([False, False, False, False, False])

ryancoe commented 11 months ago

This is curious...

Rabbit hole that you can probably skip

------------------------------------------ to make a regular wave 1. the user [specifies `f1` and `nfreq`](https://github.com/sandialabs/WecOptTool/blob/e6cda25fd4fed39e2354237b75417c7f9fee03be/wecopttool/waves.py#L121) 2. which are passed to [`elevation_fd`](https://github.com/sandialabs/WecOptTool/blob/e6cda25fd4fed39e2354237b75417c7f9fee03be/wecopttool/waves.py#L53) 3. which calls [`wot.frequency`](https://github.com/sandialabs/WecOptTool/blob/e6cda25fd4fed39e2354237b75417c7f9fee03be/wecopttool/core.py#L1308-L1335) and [`wot.frequency`](https://github.com/sandialabs/WecOptTool/blob/e6cda25fd4fed39e2354237b75417c7f9fee03be/wecopttool/core.py#L1308-L1335) is the same function called [when you create a `wec` object](https://github.com/sandialabs/WecOptTool/blob/e6cda25fd4fed39e2354237b75417c7f9fee03be/wecopttool/core.py#L208). ------------------------------------------

I experience the same problem as you do on my very different machine and installation

OS: macos 13.6
Python version: 3.11.5
WecOptTool version: 2.6.0
Any other relevant package version: wavespectra 3.15.1, capytaine 2.0

... Digging a bit more, I can see that the change happens when we call wot.run_bem, and more specifically, I can see that when we create the test_matrix dataset (see below), the omega field still matches everything else, so (not too surprising) the machine precision change comes from calling solver.fill_dataset (i.e., from capytaine, likely some compiled Fortran operation).

https://github.com/sandialabs/WecOptTool/blob/e6cda25fd4fed39e2354237b75417c7f9fee03be/wecopttool/core.py#L2124-L2147

The way I see it, we could

shift the frequencies in the BEM solution back to where they should be (this is often something like 1e-16 rad/s)
do something smart with scaling to get away from very small numbers - i think this would be tricky in practice and probably turn into a mess
do something when we merge the datasets - there are some optional arguments for xr.merge, I've experimented with these a bit but don't really yet feel comfortable with the appropriate solution here

Any thoughts @michaelcdevin @cmichelenstrofer?

ryancoe commented 11 months ago

One more thought on this... this whole thing starts with having a fundamental frequency that is a really long float

## Wave parameters
amplitude = 0.5
period = 9.5
wavefreq = 1 / period
f1 = wavefreq / 5
nfreq = 5

gives f1=0.021052631578947368

If you use wavefreq = np.floor(1e5 / period)/1e5, f1=0.021052 and the whole problem goes away.

Suggested solution: Based on this, I think we should do some rounding in wot.frequency

cmichelenstrofer commented 11 months ago

we could check that the two arrays are all close (np.allclose) and then manually change the BEM frequencies back to the original. This can be done all the time, but only would have an effect on this cases.
Don't completely follow, but does not seem like the best option.
If this works I think it is the best option. It seems the join="override" option will force the indexes of the first array onto the second.

"if indexes are of same size, rewrite indexes to be those of the first object with that dimension. Indexes for the same dimension must have the same size in all objects."

ryancoe commented 11 months ago

the indexes are not of the same size (zero frequency is not in the waves dataset)

cmichelenstrofer commented 11 months ago

Suggested solution: Based on this, I think we should do some rounding in wot.frequency

I like this! I would make it an optional argument (number of sig figs?) and allow for None if someone wants to really use the f1 they provide exactly.

michaelcdevin commented 11 months ago

Suggested solution: Based on this, I think we should do some rounding in wot.frequency

I like this! I would make it an optional argument (number of sig figs?) and allow for None if someone wants to really use the f1 they provide exactly.

Agreed! I think an optional argument defining number of decimals in wot.frequency with a default >=1e-10 is the most straightforward solution.

cmichelenstrofer commented 10 months ago

I just discussed with @ryancoe: We only changed the precision of f1 in the frequency() function, but there are several other functions in core that

use f1 directly. E.g. time
There are many others that use it indirectly. They take it as an argument and then pass it to one of the functions that use it directly. E.g. waves.regular_wave() takes f1 and passes it to frequency()

Some options:

have precision as an optional input (default 10) in all these functions.
hardcode precision everywhere.
remove the changes from #271 and instead show in the tutorials and the documentation the best practice when creating f1 through division (or subtraction).

We prefer option 3. The procedure of starting with period is very common, and the problem in this issue can be solved easily, just once, in the run case. E.g. by adding the round function around the definition of f1:

end_time = 9.5 # s
f1 = round(1/end_time, 10)

cmichelenstrofer commented 10 months ago

@michaelcdevin was this a problem in any of the tests or tutorials? if so we should fix it directly there.

michaelcdevin commented 10 months ago

@michaelcdevin was this a problem in any of the tests or tutorials? if so we should fix it directly there.

It was not, I only encountered this issue in a personal debugging notebook.

Some options:

have precision as an optional input (default 10) in all these functions.

hardcode precision everywhere.

remove the changes from https://github.com/sandialabs/WecOptTool/pull/271 and instead show in the tutorials and the documentation the best practice when creating f1 through division (or subtraction).

I'm personally less favorable towards option 3. Philosophically (and at risk of being a bit pedantic), I feel this is a problem developers should address rather than making users responsible for remembering. If the root of this issue were a nuance related to ocean energy, optimization, pseudo-spectral method, etc. that's one thing, but this is fundamentally a computer science issue that users shouldn't need to worry about. Especially since we know how to fix it... it just requires a bit more coding.

Since this really only becomes an issue when using xr.merge in the post_process function, another easy solution could be to set freq_coord to use the frequency from the waves Dataset, i.e. changing this line so it uses waves.frequency instead of self.frequency.

cmichelenstrofer commented 10 months ago

@michaelcdevin you convinced me; we should address this ourselves. Of the options above then, I would prefer 2 (hardcoding). There is no reason a user should differentiate between frequencies that differ in 10th decimal place. And adding an extra precision argument to so many functions risks being confusing and unnecessary. But...

Since this really only becomes an issue when using xr.merge in the post_process function, another easy solution could be to set freq_coord to use the frequency from the waves Dataset, i.e. changing this line so it uses waves.frequency instead of self.frequency.

If we're sure this works, and that is the only place this problem shows up, I think that is the obvious solution. We'd still revert the changes in #271

michaelcdevin commented 10 months ago

If we're sure this works, and that is the only place this problem shows up, I think that is the obvious solution. We'd still revert the changes in https://github.com/sandialabs/WecOptTool/pull/271

@cmichelenstrofer testing with the original Python script in the initial message of this issue, this fix does work and doesn't seem to cause any issues with the test suite or tutorials. I'll make a quick PR implementing this to close this issue.

sandialabs / WecOptTool

BUG: xarray `MergeError` in `wec.post_process` for certain base frequencies #264