Closed chengzhuzhang closed 1 week ago
Possible next steps:
flox
with e3sm_diags and xCDAT temporal APIs
Regarding the results: Cases 2, 3, and 4 are identical, while the xcdat result from Case 1 is slightly off.
Case 1: [1.63433977e-08 1.73700556e-08 2.73745702e-08 3.22052784e-08
3.28640795e-08 3.27481651e-08 3.03053831e-08 2.27138450e-08
2.60270063e-08 2.38527367e-08 1.89776266e-08 1.71358785e-08]
Case 2: [1.63593346e-08 1.73546146e-08 2.73492017e-08 3.22492671e-08
3.28317165e-08 3.28051981e-08 3.02749046e-08 2.27307623e-08
2.59688303e-08 2.38724820e-08 1.89765019e-08 1.71450951e-08]
Case 3: [1.63593346e-08 1.73546146e-08 2.73492017e-08 3.22492671e-08
3.28317165e-08 3.28051981e-08 3.02749046e-08 2.27307623e-08
2.59688303e-08 2.38724820e-08 1.89765019e-08 1.71450951e-08]
Case 4: [1.63593346e-08 1.73546146e-08 2.73492017e-08 3.22492671e-08
3.28317165e-08 3.28051981e-08 3.02749046e-08 2.27307623e-08
2.59688303e-08 2.38724820e-08 1.89765019e-08 1.71450951e-08]
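To put a number on "slightly off", the arrays above can be compared directly. The following is a minimal sketch using only the values posted in this thread. The variable names are illustrative, and only Case 2 is repeated because Cases 3 and 4 are reported as identical to it. Note that the values are on the order of 1e-8, so the absolute tolerance has to be set to zero for the comparison to be meaningful.

```python
import numpy as np

# Values copied from the Case 1 and Case 2 output above.
case1 = np.array([
    1.63433977e-08, 1.73700556e-08, 2.73745702e-08, 3.22052784e-08,
    3.28640795e-08, 3.27481651e-08, 3.03053831e-08, 2.27138450e-08,
    2.60270063e-08, 2.38527367e-08, 1.89776266e-08, 1.71358785e-08,
])
case2 = np.array([
    1.63593346e-08, 1.73546146e-08, 2.73492017e-08, 3.22492671e-08,
    3.28317165e-08, 3.28051981e-08, 3.02749046e-08, 2.27307623e-08,
    2.59688303e-08, 2.38724820e-08, 1.89765019e-08, 1.71450951e-08,
])

# Relative difference per month, using Case 2 as the reference.
rel_diff = np.abs(case1 - case2) / np.abs(case2)
worst_month = int(np.argmax(rel_diff)) + 1  # 1-based month number

print("relative difference per month:", rel_diff)
print("largest relative difference:", rel_diff.max(), "in month", worst_month)

# np.allclose defaults to atol=1e-8, which is as large as the data itself,
# so the default call cannot distinguish the two cases.
close_with_defaults = np.allclose(case1, case2)

# With atol=0 the comparison is purely relative.
close_relative = np.allclose(case1, case2, rtol=1e-6, atol=0.0)
print("allclose (defaults):", close_with_defaults)
print("allclose (rtol=1e-6, atol=0):", close_relative)
```

The same check with `atol=0.0` can be reused when comparing the refactored e3sm_diags output against xcdat for other variables with small magnitudes.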
Do the runtimes you captured include I/O?
Yes, they include everything from the imports to the end.
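Since the captured runtimes cover the whole script, it may help to time each stage separately so that import and I/O cost can be told apart from the computation itself. This is a minimal sketch, not the script used in this issue: the `timed` helper and the stage labels are hypothetical, and the random array and reshape-based mean are placeholders standing in for the real data loading and climatology call.

```python
import time
from contextlib import contextmanager

timings = {}

@contextmanager
def timed(label):
    """Record the wall-clock time spent inside the with-block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = time.perf_counter() - start

with timed("imports"):
    import numpy as np

with timed("load"):
    # Placeholder for reading the data: 30 years of hourly values
    # (30 * 8760 = 262800 time steps) at one grid point.
    data = np.random.default_rng(0).random(262_800)

with timed("compute"):
    # Placeholder computation: average across the 30 years.
    result = data.reshape(30, 8760).mean(axis=0)

for label, seconds in timings.items():
    print(f"{label:>8}: {seconds:.4f} s")
```

Reporting the stages separately makes it easier to see whether a slowdown comes from reading the data or from the temporal averaging.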
I believe I've found the root cause for the most major performance bottleneck:
My initial solution:
I just opened PR #689 to explore and address this issue.
Thank you for looking into this! It looks like we should also be able to get some performance improvement in the refactored E3SM Diags code by following your solution, which we can address later.
Is your feature request related to a problem?
This may not be a high-priority issue, but I think it is worth documenting here. While refactoring e3sm_diags with xcdat, I used the temporal.climatology operation to get the annual cycle of a data stream with hourly data for 30 years (ntime = 262800) at one grid point. xcdat is slower than the native e3sm_diags function. This is probably expected, because the xcdat code is more sophisticated in reading and writing the data. I also noticed that operations on an xarray.DataArray seem to be much slower than the same operations on numpy arrays, so the native e3sm_diags climo function needs to be optimized accordingly to get better performance.
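For reference, the kind of numpy-only annual cycle described above can be sketched as follows. This is an illustration under stated assumptions, not the e3sm_diags or xcdat implementation: `monthly_climatology` is a hypothetical helper, the input is synthetic, the mean is unweighted, and the standard (Gregorian) calendar of `numpy.datetime64` is assumed, whereas ntime = 262800 implies a no-leap calendar in the real data.

```python
import numpy as np

def monthly_climatology(times, values):
    """Unweighted mean of `values` for each calendar month (Jan..Dec).

    `times` is a numpy datetime64 array and `values` a 1-D float array
    of the same length.
    """
    # datetime64[M] as an integer counts months since 1970-01,
    # so modulo 12 gives 0 for January through 11 for December.
    month_index = times.astype("datetime64[M]").astype(int) % 12
    sums = np.bincount(month_index, weights=values, minlength=12)
    counts = np.bincount(month_index, minlength=12)
    return sums / counts

# Synthetic hourly time axis covering two years (assumed for illustration).
times = np.arange("2001-01-01T00", "2003-01-01T00", dtype="datetime64[h]")

# Synthetic signal: each sample equals its month number, so the expected
# climatology is simply 1, 2, ..., 12.
month_number = times.astype("datetime64[M]").astype(int) % 12 + 1
values = month_number.astype(float)

climo = monthly_climatology(times, values)
print(climo)
```

Because the grouping is a single `np.bincount` pass over an integer index, there is no per-group Python overhead, which is one plausible reason a numpy path can be faster than a general-purpose grouped operation on labeled arrays.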
Describe the solution you'd like
We can check whether there are places where a performance gain is possible.
Describe alternatives you've considered
No response
Additional context
Attaching example code and performance data.