Closed ssolson closed 1 year ago
lol pandas released a new version last night with breaking changes for us. I am going to pin the pandas version to the previous so we can close our current PRs.
Not necessary at this time but I want to document here the issues which seem to mostly arise around the use of py 3.7. In the future, we will look to deprecate 3.7.
Okay so there is an issue I have been chasing down now regarding the hindcast tests. test_multi_loc is continually returning a 0 array for the data being passed in the test. The error returned from the test:
FAILED ::TestWPTOhindcast::test_multi_loc - ValueError: cannot reshape array of size 0 into shape (8748,29,24,1)
This is occurring inside of hypyd in a call that converts bytesToArray:
data = b'', dt = dtype('float32'), shape = (8748, 29, 24, 1)
def bytesToArray(data, dt, shape):
nelements = getNumElements(shape)
if not isVlen(dt):
# regular numpy from string
arr = np.frombuffer(data, dtype=dt)
else:
arr = np.zeros((nelements,), dtype=dt)
offset = 0
for index in range(nelements):
offset = readElement(data, offset, arr, index, dt)
if shape is not None:
if shape == () and dt.shape:
# special case for scalar array with array sub-type
arr = arr.reshape(dt.shape)
else:
> arr = arr.reshape(shape)
E ValueError: cannot reshape array of size 0 into shape (8748,29,24,1)
..\..\anaconda3\lib\site-packages\h5pyd\_hl\base.py:569: ValueError
Something has changed in our dependencies and I have not at this time I have not been able to track down what is breaking this test.
@rpauly18 there is some issue with rex causing the tests to fail.
Specifically: The parameter directional_wave_spectrum
when calling get_lat_lon_df
on the path: '/nrel/US_wave/virtual_buoy/West_Coast/West_Coast_virtual_buoy_1995.h5'
.
E.g.:
from rex import WaveX
year='1995'
region = wave.io.hindcast.region_selection((43.489171,-125.152137))
wave_path = f'/nrel/US_wave/virtual_buoy/'+region+'/'+region+'_virtual_buoy_'+year+'.h5'
with WaveX(wave_path, hsds=True) as rex_waves:
meta = rex_waves.meta
time_index = rex_waves.time_index
data_raw = rex_waves.get_lat_lon_df('directional_wave_spectrum',(43.489171,-125.152137))
I have found that switching the parameter to energy_period
will return the correct data as expected. The above example is the simplest way to reproduce the error from our testing suite. The exact error is found in test_multi_loc
when it specifically calls the hindcast module function request_wpto_directional_spectrum
.
request_wpto_directional_spectrum
uses additional kwargs in the call to WaveX but the error returned by the code above is the same.
Since you wrote this module I wanted to reach out to you to see if you had any idea to try anything I had not yet. I will go ahead and reach out to the rex team using my example above to see if they can offer any insight.
@ssolson @rpauly18 Our PR backlog continues to grow, so I have a couple suggestions for an immediate path forward:
Any thoughts on this workflow? #187, #193, #196, #197 are all reviewed and more or less ready to merge, pending some issues with tests. I'd be great to merge these PRs sometime next week using one of the above options
It would be very easy to comment out the hindcast tests the way I have set it up. The reason I had not suggested this is primarily because it means we are not testing our whole package for breaking changes from new PRs. I think your plan works for #193, #196, and #197 but since #187 uses rex it needs to be added to the serialized portion of the tests and I would not recommend it for inclusion until after getting the current issues worked out.
That makes sense about #187. I agree that we will not be fully checking PRs as they come in, but any breaking updates would be isolated to the Develop branch. This way we can keep up with development and once the tests are fixed, they should identify any outstanding issues. Can you give an update on the status of this PR for the other versioning/platform issues aside from rex?
To get your plan working for this PR I need to
I will go ahead and do that now since it is so easy.
@ssolson Thanks, when you're ready can you mark the PR ready for review? I'll find time to review this and hopefully merge a couple other PRs before I'm out of office.
Guys I have completed the follow up with Grant with is final comments and recommendations here https://github.com/NREL/rex/issues/133#issuecomment-1303685718.
...here are some observations based on what we've tried:
- Neither of us seem to be able to pull the full data volume from that dataset
- The errors are being raised by h5pyd, not rex, and appear to be saying "no data has been returned"
- Smaller datasets or a slice of the spectrum dataset are retrieved just fine
- The original code requesting the spectrum data in dataframe works fine when pointing to the Eagle datasets Given the above, I think the issue is in bandwidth limitations on the NREL HSDS API. It could be that your bandwidth limit was > recently reduced, which would appear like something changed in the code that is just now causing failures but in reality is an > external change.
My recommendations right now are to either 1) stand up your own HSDS service so you can control your data streaming or > > 2) reach out to the NREL developer network to get your bandwidth increased.
In summary, it seems that the bandwidth of the NREL HSDS was reduced recently. He suggests we set up our own HSDS which I believe is quite firmly outside the scope of this project or reach out to the NREL developer network. I think we should go with the second option. @rpauly18 I am happy to move forward with this option but wanted to check with you on if you had contacts over there.
@akeeste I expect these tests to pass and will remove from draft status when they do. I have updated the header post to this PR which captures the limitations/ compromise of the testing suite we are pushing to the develop branch.
This PR improves the testing structure of MHKit by:
Each of these recommendations was suggested to us in #183.
This PR additionally applies a temporary NetCDF version pin to the previous release to fix issues with pip installation (#177) and allow tests to pass.
An additional issue around bandwidth related NREL's HSDS API occurred while working on this issue. In order to clear out the current backlog of PRs we are closing this PR with the following limitations: