roocs / clisops

Climate Simulation Operations
https://clisops.readthedocs.io/en/latest/
Other
21 stars 9 forks source link

Subset without masking - accept shapely polygons #337

Closed aulemahal closed 4 months ago

aulemahal commented 4 months ago

Pull Request Checklist:

What kind of change does this PR introduce?:

Two new features:

Does this PR introduce a breaking change?:

No. The default value of the new argument is True, which retains the previous behaviour.

Other information:

aulemahal commented 4 months ago

For illustration. This is a section of a Rotated Pole grid (sftlf of RDRS):

before

Shown are a polygon and a box, both of course defined using lat-lon coordinates. The plots are all using the grid's Rotated Pole projection.

Result of subset_bbox with and without mask_outside: after-bbox

Result of subset_shape with and without mask_outside: ( I also added the lat-lon bounding box of the shape) after-shape

coveralls commented 4 months ago

Pull Request Test Coverage Report for Build 8711359080

Details


Changes Missing Coverage Covered Lines Changed/Added Lines %
clisops/core/subset.py 16 18 88.89%
<!-- Total: 16 18 88.89% -->
Totals Coverage Status
Change from base Build 8708308516: 0.009%
Covered Lines: 1783
Relevant Lines: 2462

💛 - Coveralls
huard commented 4 months ago

What is the rationale for using this instead of subset_bbox ?

aulemahal commented 4 months ago

Additions to subset_bbox from this PR are there when one wants to extract a subset of a rotated grid but doesn't want to dig to find the actual rlon/rlat to be used in a sel. This allows one to specify the approximate region using regular coordinates but without the ugly-looking NaNs.

Additions to subset_shape follow the same idea but for the case when we want to extract the region around a polygon of interest.

The second image shows the actual thing I was wanting to get done. I decided of a region for my work and I can easily specify it in lat/lon coordinates because I know this part of the world well. But my goal wasn't to extract the exact box, just to have a smaller dataset. And its for the evaluation of a model, so wasting pixels like this is useless.

huard commented 4 months ago

I guess I'm slightly uncomfortable because the area "outside" is not something I've worried at all, as long as it contained the shape I was interested in. What I mean is that I don't think we've tested the results for consistency. I seem to remember using heuristics to take a wider area around the shape, to make sure I'm including everything. If this heuristic selection is now part of the API, it may require some deeper review.

Isn't what you want essentially shape_bbox_indexer with a better name, or a higher level function that also accepts a bbox?

aulemahal commented 4 months ago

Indeed, this could be done with shape_bbox_indexer.

However, I'm not sure I understand your opinion :

[...] the area "outside" is not something I've worried at all, as long as it contained the shape I was interested in.

That's also my case, which is why I want a way not to mask the area outside. I just want the subset that contains the shape/box I'm interested in.

EDIT: at the end of the day, I want something wrappable by xscen. This addition of kwargs makes it easy, but we could also implement this as another high-level function here or as an option to spatial.subset in xscen, I don't really care.

aulemahal commented 4 months ago

After discussion with David, I will close this PR and reopen a simpler one only for the addition of shapely polygons as inputs of subset_shape, with maybe some other improvements to shape_bbox_indexer, which can be used for the goal I was trying to achieve here.