Closed akeeste closed 6 months ago
I'm currently working through a few issues with getting the tests to pass. There is a minor bug in the standard deviation calculation in loads.general.bin_statistics
.
The original code uses pandas input. In bin_statistics
the Pandas input, data
, is sliced by both signal_name
and binnumber
:
However, one issue with this set-up--If a signal within a Pandas DataFrame is sliced and only has one value (i.e. for i=22 with the test data), the result is a Pandas Series, not another Pandas DataFrame. The std of a pandas Series returns a float, whereas the std of a pandas DataFrame returns a DataFrame of length 1. This results in data that should have std=0 to error in line 68 and results in std=nan on Line 70 instead. The original data being tested against was created with this error and led to the test failing. I updated the test data to reflect the correct standard deviation (0) for those bins with only one data point.
df = pd.DataFrame([1., 2.])
s = pd.Series([1., 2.])
print(df.std())
print(s.std())
print(df.std()[0])
print(s.std()[0])
Instead of calculating the standard deviation manually, I suggest we use scipy.stats.binned_statistic
with statistic='std', just as we do in line 59 above.
@ssolson This PR is ready for review.
@ssolson This PR is ready for review.
Thanks Adam, I will not be able to complete my review of this till Wednesday this week due to travel. I will tackle as soon as I can.
Thanks @ssolson I'll go through the review and make updates early next week.
TODO from discussion with @ssolson today:
also
@ssolson I addressed your review comments and added the option for a user to specify the time dimension for >1D variables. But I don't know a robust way to verify that the user's specified dimension is for example a timeseries, so I think we should leave that for another update.
The user is also limited to inputting a single variable for some functions, like mler_coefficients
where the user should really only input 1 wave spectrum at a time.
Thanks @ssolson. I'll make those changes then merge this PR
@akeeste the CDIP Threads Server seems to be down right now. The test will not pass until it is back up. https://thredds.cdip.ucsd.edu/
Threads is back up.
Loads/general
Loads/extreme