Closed paigem closed 2 years ago
It might be helpful if @rabernat could look at the 'shrink'/'unshrink' logic and see if there are performance improvements that we have not considered?
Maybe we should profile the memory use of this?
I did some quick profiling to compare runtime comparing with and without the mask. This preliminary comparison shows a speed up of about 20% by masking land values.
This is not a complete comparison. The following assumptions are made:
(3600, 2700, 2)
(CM2.6-sized with 2 time steps)ecmwf
algorithm (completely arbitrary)
coare3p0
and saw similar results (~20% speedup)noskin()
so fardata = create_data((3600, 2700, 2), chunks=None)
%snakeviz out_data = noskin_nomask(*data, 'ecmwf', 2, 10, 6)
We can see that the numpy wrapper noskin_np()
takes essentially the entire runtime.
data = create_data((3600, 2700, 2), chunks=None, land_mask=True)
%snakeviz out_data = noskin(*data, 'ecmwf', 2, 10, 6)
The small rectangles in the bottom right are the extra work needed to shrink and unshrink the arrays. Even with these extra computations, the runtime is noticeably shorter.
# Take average runtime across 3 runs
# No mask
nomask_avg = np.mean((60.467731952667236,61.46433997154236,61.295777797698975))
# Mask
mask_avg = np.mean((49.637826919555664,46.95607113838196,48.179039001464844))
total_percent_faster = (nomask_avg - mask_avg)/nomask_avg
total_percent_faster
Result: 0.2098748237283278 --> ~20% faster!
As @jbusecke stated above, we may also want to profile memory usage.
Thats amazing @paigem! Thanks for the nice comparison.
I had one more thing, that might be useful to change here. I believe now that we are converting the input to a 1d array in any case, we can get rid of this wrapper entirely.
To check that this is correct, I would write a test that puts arrays of various dimensions (1-4D?) through the xarray wrappers and make sure that this does not lead to crashes.
@jbusecke Take a look at the new test I wrote in this commit to verify that our xarray wrapper takes arrays of size 2d and greater (i.e. no longer just 3d!). It seems to work fine, but I wasn't sure how to write a test for that...
I made a few minor suggestions. If you agree with those, you can commit and merge once the tests pass (make sure to double check that the % line goes to 100 in the CI log).
This PR does the following:
NaN
)Additionally:
threadsafe
for xarray, skin computations (see #39 for more discussion on ways around this)create_data()
from #37test_flux_np.py
create_test_data.py
Closes #31