boutproject / xBOUT

Collects BOUT++ data from parallelized simulations into xarray.
https://xbout.readthedocs.io/en/latest/
Apache License 2.0
21 stars 10 forks source link

Compatibility with xarray>=2022.9.0 #276

Open johnomotani opened 1 year ago

johnomotani commented 1 year ago

Following this comment https://github.com/boutproject/xBOUT/issues/213#issuecomment-1397686005, converting a few members of Region from scalar DataArray to float reduces the number of errors.

WIP because there are still a few errors in the unit tests, and performance is worse - for some unit tests between 1.5x and 2x slower.

Fix #275.

codecov-commenter commented 1 year ago

Codecov Report

Merging #276 (a39b729) into master (280a570) will not change coverage. The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master     #276   +/-   ##
=======================================
  Coverage   69.01%   69.01%           
=======================================
  Files          15       15           
  Lines        3208     3208           
  Branches      790      790           
=======================================
  Hits         2214     2214           
  Misses        732      732           
  Partials      262      262           
Impacted Files Coverage Δ
xbout/region.py 83.55% <100.00%> (ø)

:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more

johnomotani commented 1 year ago

I think with this PR the errors are gone with the latest xarray versions, but a significant slow-down is still there. I suspect the culprit is in the region operations - the tests in test_region.py seem to take ~2x as long (in some cases? I haven't checked the whole list carefully), and those operations are used several times in other operations that have slowed down (like interpolate_parallel() and various 2d plotting methods).

xarray made a lot of changes to how they handle 'indexes' recently. I wonder if the newest versions end up creating 'indexes' where they were not created before, adding overhead that way. If so, maybe we can try and avoid that happening in cases where it is not necessary? [E.g. in region.py it might not be necessary to create indexes for the guard-cell DataArrays in _concat_*_guards(), if the indexes can just be used from the main array?]