NREL / flasc

A rich floris-driven suite for SCADA analysis
https://nrel.github.io/flasc/
BSD 3-Clause "New" or "Revised" License

Feature: add gaussian blending to floris solutions in postprocess step (similar to UncertaintyInterface but in post-) #114

Closed Bartdoekemeijer closed 1 year ago

Bartdoekemeijer commented 1 year ago

This PR is ready to be merged.

Feature or improvement description
This PR adds a new function in floris_tools that lets users apply Gaussian smearing to their df_fi_approx along the wind direction axis. The results should be identical to those of FLORIS' UncertaintyInterface with fix_yaw_in_relative_frame=True, but are obtained much faster because this is purely a post-processing step.

Related issue, if one exists
N/A

Impacted areas of the software
floris_tools

Additional supporting information
This PR allows the rapid generation of FLORIS solutions for different values of wd_std from an existing precalculated dataset. This can be helpful when tuning a model across various levels of wd_std.

There are also a couple of small bug fixes in the code. Pulling them in like this is not the cleanest approach, but I figured it would do the job since they are in the same functions/files.

Test results, if applicable
Here is a minimal example:

import numpy as np
from matplotlib import pyplot as plt

from flasc.floris_tools import (
    calc_floris_approx_table,
    add_gaussian_blending_to_floris_approx_table,
)
from flasc.examples.models import load_floris_artificial as load_floris

if __name__ == "__main__":
    # Load FLORIS object
    fi, _ = load_floris()

    # Get FLORIS approx. table
    df_fi_approx = calc_floris_approx_table(
        fi,
        wd_array=np.arange(0.0, 360.0, 3.0),
        ws_array=[8.0],
        ti_array=[0.08],
    )

    # Power columns that are summed below (turbines 0 through 6)
    pow_cols = [f"pow_{ti:03d}" for ti in range(7)]

    fig, ax = plt.subplots()
    ax.plot(df_fi_approx["wd"], df_fi_approx[pow_cols].sum(axis=1), label="Normal")

    # Apply Gaussian blending for several wind direction standard deviations
    for wd_std in [1.0, 3.0, 10.0, 20.0]:
        df_fi_approx_gauss = add_gaussian_blending_to_floris_approx_table(
            df_fi_approx, wd_std=wd_std
        )
        ax.plot(
            df_fi_approx_gauss["wd"],
            df_fi_approx_gauss[pow_cols].sum(axis=1),
            label=f"Gaussian smeared (wd_std={wd_std:.1f} deg)",
        )

    ax.grid(True)
    ax.set_xlabel("Wind direction (deg)")
    ax.legend()
    plt.show()

It produces the following plot: [image]
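
For intuition, Gaussian blending along the wind direction axis can be sketched with plain NumPy. This is an illustration under stated assumptions, not the implementation in floris_tools: the function name gaussian_blend_over_wd and the synthetic power curve are hypothetical, and the wind direction grid is assumed to be uniform and to cover the full circle.

```python
import numpy as np


def gaussian_blend_over_wd(wd, values, wd_std):
    """Gaussian-weighted average of `values` over wind direction (degrees).

    Hypothetical helper for illustration. Assumes `wd` is a uniform grid
    covering the full circle, so every direction has the same neighbours
    once the wrap-around at 0/360 deg is taken into account.
    """
    wd = np.asarray(wd, dtype=float)
    values = np.asarray(values, dtype=float)

    # Signed angular difference between every pair of directions, wrapped
    # into [-180, 180) so that 359 deg and 1 deg are 2 deg apart.
    delta = (wd[:, None] - wd[None, :] + 180.0) % 360.0 - 180.0

    # Gaussian weights, normalized so that each row sums to one
    weights = np.exp(-0.5 * (delta / wd_std) ** 2)
    weights /= weights.sum(axis=1, keepdims=True)

    return weights @ values


wd = np.arange(0.0, 360.0, 3.0)

# Synthetic power curve: a wake-like dip centred on 0 deg (wraps around 360)
offset = (wd + 180.0) % 360.0 - 180.0
power = 1.0 - 0.5 * np.exp(-0.5 * (offset / 5.0) ** 2)

blended = gaussian_blend_over_wd(wd, power, wd_std=10.0)
```

Because the weights only depend on the precomputed values, trying another wd_std means repeating a cheap weighted average rather than rerunning the wake model, which is the speed-up the PR description refers to.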

paulf81 commented 1 year ago

This looks good, @Bartdoekemeijer. Going to add it to the 1.4 milestone.

Bartdoekemeijer commented 1 year ago

Thanks for reviewing!

  • Why the line that made sure "time" is dropped is commented out

If the dataframe is large, the call reset_index slows the code down. I am now tackling a large dataset and found that this helps in speeding up the code. Also, I think we agreed that time should be just a column in df, not the index. So I guess I am forcing conventions a bit. Happy to revert if you experience issues with it.

  • The switch to only interpolating the "pow" column (not the others in varnames)

Are you referring to this line? We are still interpolating for each variable. The point is that when you call interpolate.RegularGridInterpolator, it calculates a triangulation between all the data points so that it can easily interpolate. This takes quite a bit of time for larger data tables. In the old code, we unnecessarily recalculate this triangulation for each varname. Now, knowing the triangulation is the same, I just updated the to-be-interpolated values through this line. The results should be identical between the old and new code, but the new code should be significantly faster.
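
The reuse pattern described above can be sketched as follows. This is a minimal illustration, not the FLASC code: the table, grid, and variable names are made up, and it assumes that a SciPy RegularGridInterpolator built with method="linear" reads its values attribute at call time, so that assigning new values on the same grid is equivalent to constructing a new interpolator. Whatever setup the interpolator performs at construction is then paid only once.

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

# Hypothetical precomputed table on a (wd, ws) grid with two variables
wd_grid = np.arange(0.0, 360.0, 3.0)
ws_grid = np.array([6.0, 8.0, 10.0])
rng = np.random.default_rng(0)
table = {
    "pow_000": rng.random((wd_grid.size, ws_grid.size)),
    "pow_001": rng.random((wd_grid.size, ws_grid.size)),
}
names = list(table)

# Query points: (wd, ws) pairs inside the grid, e.g. one per timestamp
query = np.column_stack(
    [rng.uniform(0.0, 357.0, 1000), rng.uniform(6.0, 10.0, 1000)]
)

# Build the interpolator once, then swap in the values of each variable
f = RegularGridInterpolator((wd_grid, ws_grid), table[names[0]], method="linear")
reused = {}
for name in names:
    f.values = table[name]  # same grid, different values (assumed safe for "linear")
    reused[name] = f(query)

# Reference: construct a fresh interpolator for every variable
rebuilt = {
    name: RegularGridInterpolator((wd_grid, ws_grid), table[name], method="linear")(query)
    for name in names
}
```

Comparing reused against rebuilt is a cheap way to confirm that both approaches agree before relying on the faster one.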

paulf81 commented 1 year ago

If the dataframe is large, the call reset_index slows the code down. I am now tackling a large dataset and found that this helps in speeding up the code. Also, I think we agreed that time should be just a column in df, not the index. So I guess I am forcing conventions a bit. Happy to revert if you experience issues with it.

This is great I think, when I was originally mapping more of the code to polars, this was one of the changes I had to make (polars only uses simple indexes). I think a standard where time is just an index is a good one, it will also make grabbing polars speed up more broadly a smaller step.

misi9170 commented 1 year ago

@paulf81 you mean a standard where time is just a column, right?

misi9170 commented 1 year ago

Are you referring to this line? We are still interpolating for each variable. The point is that when you call interpolate.RegularGridInterpolator, it calculates a triangulation between all the data points so that it can easily interpolate. This takes quite a bit of time for larger data tables. In the old code, we unnecessarily recalculate this triangulation for each varname. Now, knowing the triangulation is the same, I just updated the to-be-interpolated values through this line. The results should be identical between the old and new code, but the new code should be significantly faster.

Ok that makes sense, figured it must be something like that but just couldn't quite connect the dots, thanks @Bartdoekemeijer! Proceeding to merge.

paulf81 commented 1 year ago

@paulf81 you mean a standard where time is just a column, right?

yes, sorry!!
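
The convention settled on in this thread, with time kept as a regular column rather than as the index, can be sketched as below. The helper name ensure_time_column and the example data are hypothetical and only illustrate the idea: reset_index is called only when the dataframe does not already follow the convention, so large dataframes that do are passed through untouched.

```python
import numpy as np
import pandas as pd


def ensure_time_column(df):
    """Return `df` with "time" as a regular column (hypothetical helper).

    The dataframe is returned as-is when it already has a "time" column,
    which avoids an unnecessary reset_index on large dataframes.
    """
    if "time" in df.columns:
        return df
    if df.index.name == "time":
        return df.reset_index(drop=False)
    raise KeyError('dataframe has neither a "time" column nor a "time" index')


# Small example dataframe that follows the convention
df = pd.DataFrame(
    {
        "time": pd.date_range("2023-01-01", periods=6, freq="10min"),
        "wd": np.linspace(0.0, 50.0, 6),
        "pow_000": np.linspace(100.0, 600.0, 6),
    }
)

df_same = ensure_time_column(df)  # already compliant: no work done
df_fixed = ensure_time_column(df.set_index("time"))  # index moved back to a column
```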