Closed Bartdoekemeijer closed 1 year ago
This looks good @Bartdoekemeijer, going to add to the 1.4 milestone,
Thanks for reviewing!
- Why the line that made sure "time" is dropped is commented out
If the dataframe is large, the call to `reset_index` slows the code down. I am now tackling a large dataset and found that this helps in speeding up the code. Also, I think we agreed that `time` should be just a column in `df`, not the index. So I guess I am forcing conventions a bit. Happy to revert if you experience issues with it.
- The switch to only interpolating the "pow" column (not the others in varnames)
Are you referring to this line? We are still interpolating for each variable. The point is that when you call `interpolate.RegularGridInterpolator`, it calculates a triangulation between all the data points so that it can easily interpolate. This takes quite a bit of time for larger data tables. In the old code, we unnecessarily recalculate this triangulation for each varname. Now, knowing the triangulation is the same, I just updated the to-be-interpolated values through this line. The results should be identical between the old and new code, but the new code should be significantly faster.
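For readers following along, here is a minimal sketch of the reuse pattern described above. It is not the code from this PR: the grid, the table contents, and the variable names are made up for illustration, and it assumes that overwriting the interpolator's `values` attribute is safe when `method="linear"`. A fresh interpolator per variable is built only to check that both routes agree.

```python
import numpy as np
from scipy import interpolate

# Hypothetical regular grid of wind directions (deg) and wind speeds (m/s)
wd = np.arange(0.0, 360.0, 30.0)
ws = np.arange(4.0, 13.0, 2.0)
rng = np.random.default_rng(0)

# Hypothetical lookup tables, one per variable, all defined on the same grid
tables = {
    "pow_000": rng.random((wd.size, ws.size)),
    "pow_001": rng.random((wd.size, ws.size)),
}

# Query points as (wd, ws) pairs, all inside the grid bounds
points = np.array([[15.0, 5.0], [100.0, 9.5], [275.0, 7.0]])

# Build the interpolator once, then only swap the gridded values per variable
varnames = list(tables)
f = interpolate.RegularGridInterpolator(
    (wd, ws), tables[varnames[0]], method="linear"
)
reused = {}
for name in varnames:
    # Assumption: replacing .values in place is valid for method="linear"
    f.values = tables[name]
    reused[name] = f(points)

# Reference: a new interpolator for every variable (the slower pattern)
fresh = {
    name: interpolate.RegularGridInterpolator(
        (wd, ws), tables[name], method="linear"
    )(points)
    for name in varnames
}
```

If overwriting an attribute feels too fragile, `RegularGridInterpolator` also accepts values with trailing axes, so all variables can be stacked into one array of shape `(n_wd, n_ws, n_vars)` and interpolated in a single call.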
> If the dataframe is large, the call to `reset_index` slows the code down. I am now tackling a large dataset and found that this helps in speeding up the code. Also, I think we agreed that `time` should be just a column in `df`, not the index. So I guess I am forcing conventions a bit. Happy to revert if you experience issues with it.
This is great, I think. When I was originally mapping more of the code to polars, this was one of the changes I had to make (polars only uses simple indexes). I think a standard where time is just an index is a good one; it will also make adopting the polars speed-up more broadly a smaller step.
@paulf81 you mean a standard where time is just a column, right?
> Are you referring to this line? We are still interpolating for each variable. The point is that when you call `interpolate.RegularGridInterpolator`, it calculates a triangulation between all the data points so that it can easily interpolate. This takes quite a bit of time for larger data tables. In the old code, we unnecessarily recalculate this triangulation for each varname. Now, knowing the triangulation is the same, I just updated the to-be-interpolated values through this line. The results should be identical between the old and new code, but the new code should be significantly faster.
Ok that makes sense, figured it must be something like that but just couldn't quite connect the dots, thanks @Bartdoekemeijer! Proceeding to merge.
> @paulf81 you mean a standard where time is just a column, right?
yes, sorry!!
This PR is ready to be merged.
Feature or improvement description
This PR adds a new function in `floris_tools` that allows users to apply Gaussian smearing to their `df_fi_approx` along the wind direction axis. The results should be identical to those of FLORIS' `UncertaintyInterface` with `fix_yaw_in_relative_frame=True`, but obtained much faster because this is purely a post-processing step.
Related issue, if one exists
N/A
Impacted areas of the software
floris_tools
Additional supporting information
This PR allows the rapid generation of FLORIS solutions from an existing precalculated dataset with different values for `wd_std`. This can be helpful when tuning a model while including various levels of `wd_std`.
There are also a couple of small bug fixes in the code. Not the cleanest way of pulling them in like this, but figured it'll do the job okay since it's in the same functions/files.
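To illustrate the idea of smearing as a post-processing step, here is a rough sketch, not the function added in this PR. The helper name `smear_over_wind_direction`, its arguments, and the toy power curve are all hypothetical. It assumes the table sits on a uniform wind direction grid that covers the full circle, so that a circular shift wraps 358 deg around to 0 deg, and it does not attempt to verify the equivalence with `UncertaintyInterface` claimed above.

```python
import numpy as np


def smear_over_wind_direction(values, wd_step, wd_std, n_std=3.0):
    """Gaussian-weighted circular average along axis 0 (wind direction).

    Hypothetical helper: `values` has shape (n_wd, ...) on a uniform grid
    with spacing `wd_step` (deg) that spans the full 360 deg.
    """
    values = np.asarray(values, dtype=float)
    if wd_std <= 0.0:
        return values.copy()

    # Integer grid offsets covering +/- n_std standard deviations
    half_width = int(np.ceil(n_std * wd_std / wd_step))
    offsets = np.arange(-half_width, half_width + 1)

    # Discrete Gaussian weights, normalized so they sum to one
    weights = np.exp(-0.5 * (offsets * wd_step / wd_std) ** 2)
    weights /= weights.sum()

    # np.roll wraps around, which handles the 0/360 deg boundary
    smeared = np.zeros_like(values)
    for k, w in zip(offsets, weights):
        smeared += w * np.roll(values, k, axis=0)
    return smeared


# Toy table: one wind speed, with a made-up wake dip centered at 270 deg
wd = np.arange(0.0, 360.0, 2.0)
power = 1.0 - 0.4 * np.exp(-0.5 * ((wd - 270.0) / 5.0) ** 2)
power = power[:, None]  # shape (n_wd, n_ws)

smeared = smear_over_wind_direction(power, wd_step=2.0, wd_std=3.0)
```

Because the weights are normalized and the shift is circular, the mean over all wind directions is unchanged while the dip becomes shallower and wider. Re-running the helper with another `wd_std` only repeats this cheap averaging on the stored table.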
Test results, if applicable
Here is a minimal example:
And produces: