pytroll / pyresample

Geospatial image resampling in Python
http://pyresample.readthedocs.org
GNU Lesser General Public License v3.0
346 stars 94 forks source link

Spherical refactoring #420

Open ghiggi opened 2 years ago

ghiggi commented 2 years ago

This PR provides a refactoring of spherical.py, introduces new methods and solves long-standing issues. This PR is required for a future PR addressing the missing methods/option for cropping/slicing of SwathDefinition and AreaDefinition. It is also likely that pyorbital and pytrollschedule can benefit of this PR.

The refactoring of spherical.py tries to stick to the naming/conventions employed by the shapely library, which is commonly used to perform planar geometrical computation. The PR will introduce backward incompatibilities!

Here below I summarize briefly the content of this very long (sorry!) PR. It implement 6 main classes:

SPoint

SArc

SLine

SPolygon/SMultiPolygon

SExtent

I didn't implement yet carefully tests as I wait for your opinion before proceeding. But I used the code to run intersection/union and get MODIS and GPM swath slices with GEO AreaDefinition over 1 year of data and everything looks fine ;) As a note of caution, in case of polygons encloses the poles, the SExtent of the SPolygon is currently wrongly calculated as I didn't manage to implement a function that checks if a point (i.e. the pole) is contained in the polygon. I AM LOOKING FOR YOUR HELP WITH THIS.

I am sorry for this long PR landing on your head, I hope you can forgive me @mraspaud @djhoese @gerritholl.

mraspaud commented 2 years ago

@ghiggi thanks a lot for putting all the effort! Some preliminary remarks:

As such, I strongly suggest you set this PR aside, and start making instead small incremental PRs, adding at most a couple of features/bug fixes at the time, with accompanying tests and documentation. The changes you propose here look very promising, and it would be a pity they don't get through because of having too much to chew at once...

gerritholl commented 2 years ago

No need to ask for forgiveness for what looks like great work toward solving some long-standing problems, but I second @mraspaud that smaller bites help to avoid choking. Ten small PRs would be easier to handle than one big one. Would it be possible to split it?

djhoese commented 2 years ago

If the change to a more shapely-compatible set of names is where we want to go (I'm ok with that) then I'd suggest either:

  1. If possible add new methods to the existing classes while keeping the old ones.
  2. Create a whole new class (ex. SpherePolygon or something) that the user has to opt-in to using and switching.

This way we can put deprecation warnings on all the old stuff and eventually remove it. Heck you could even put it in the future subpackage of pyresample where all the pyresample 2.0 stuff is supposed to go.

Without reading the actual changes, I worry about testing for this. Some of the tests that were added for the existing interfaces were made because of edge cases with the implementation. Verifying that a new interface doesn't suffer from those or similar issues might be difficult.

In addition to agreeing with the many smaller PRs idea, I'd also suggest thinking about making multiple commits. This entire PR is one commit. That's 2000 lines of code where you never had a breaking/pause point. Separate commits make it easier to review your changes and issues you've run into as you develop something so we can work on it. You could even consider Test Driven Development (TDD) where you right tests first and then write the user code that makes that test pass. It may make it easier to make smaller commits and pull requests this way too since you'd be dividing work into user use cases.

ghiggi commented 2 years ago

I definitely understand your concerns and I could start to do small incremental PRs instead of this huge one. But it would be almost impossible to perform these PRs to spherical.py. I modified spherical.py in this PR with the intention of providing you with an overview of where the main changes happen and discussing your opinions.

I am aware that it exists also the old spherical_geometry.py file, but if you agree with the need for the refactoring I propose, I think it would be nice if we add a new file called spherical_future.py (or let me know the name) where I start adding incremental PRs that you can review progressively. I could proceed in the following order:

  1. Add SPoint
  2. Add SArc
  3. Add SLine
  4. Add SExtent
  5. Add SPolygon and SMultiPolygon

Regarding the integration of such classes defined in spherical_future.py into the current workflow, I would foresee the following:

When the above will be in place, I will be able to add the PR to enhance get_area_slices also for SwathDefinition. I come up with a quite efficient implementation that can return in about 30 seconds the intersection slices of 1 day of LEO orbits with a GEO disk, with a custom specification of the minimum_overlap_fraction required to define the intersection (see for value 0.9 and 0 in the below pictures) .

min_overlap_0 9 min_overlap_0

Let me know what you think and the path forward !!

djhoese commented 2 years ago

Others may disagree with this, but here is my gut reaction in order to:

  1. Not waste your time @ghiggi separating things out in a way that might not help reviewing that much anyway. I know this contradicts some of what I said before, but separating PRs by object type will make it harder to see the overall picture even if it does reduce the individual PR review time. I don't see a way to separate it out any other way unless you see a way to submit individual bug fixes to the existing interfaces and then make a separate PR for the new interfaces (this one would still be large).
  2. Put things in a place that makes future changes easier and require less worrying about backwards compatibility.

I think you should:

  1. Put your new spherical geometry in pyresample/future/spherical/ (https://github.com/pytroll/pyresample/tree/main/pyresample/future). You already have these shapely utils, but are they used by any other module? I'd say put them in pyresample/future/spherical/shapely_utils.py or something like that. You could go Java-style and put each spherical object in its own spherical/ submodule, but maybe there is a more consolidated structure that would work (ex. primitives.py for lines, points, and arcs; polygons.py for extent and polygons).
  2. Put tests in a pyresample/test/test_futures/test_spherical.py (or test_spherical sub-directory).
  3. Any future development/changes to geometry objects (SwathDefinition, AreaDefinition, etc) should be done in the pyresample/future module and be in line with existing Pyresample 2.0 design concepts (#181, https://github.com/pytroll/pyresample/milestone/3). Although as I say that I realize none of those issues include the decisions/discussions we had made about geometry objects...oops.

Questions:

  1. Are these individual classes meant to be seen or used by the user? Or are only the highest ones needed by the user?
  2. What time zone are you in?
  3. Can you meet with the pyresample devs to discuss some of these concepts so we are all on the same page about what's going on?
  4. Question for everyone, if we are putting this in future, should we make more drastic changes to the spherical polygon API?

We appreciate all you've done and I think these will all be really good improvements for pyresample. Thanks for helping and hopefully we can get these things figured out sooner rather than later.

ghiggi commented 2 years ago

I agree with all that you suggest !

Answering already your question:

  1. If working from Area/SwathDefinition objects, the import of the classes will not be necessary because SPolygon and SLine object could be derived by area_def.boundary().line() and area_def.boundary().polygon() But I think people might be interested in accessing some of the high classes with something like from pyresample.spherical import SPolygon, SLine, SPoint to perform operations that currently shapely does not enable. As an example, with less than 10 lines of code is possible to simulate/generate SwathDefinition objects using pyorbital satellite orbits.
  2. I live in Switzerland, so CET. Not a problem on my side to also meet in the evenings. Next week I would be available unfortunately only Monday (preferably) or Thursday ... when you prefer ...

Let me know @mraspaud @djhoese @gerritholl. Have a nice weekend.

mraspaud commented 2 years ago
  • So, I think that for starting, we can do a copy of current spherical.py into futures/spherical.py. After this first merge, I can start to do incremental PR to this file, so that you can easily track the changes, and you can easily see the point at which it breaks the compatibility.

No, please do not duplicate code! Make calls to existing functionality from functions with new names, but please do not copy...

djhoese commented 2 years ago

If there are bug fixes that can be applied to the existing interfaces :+1:. Make those PRs. I see you already made one for the inplace operations in the intersection code.

Otherwise, for these new interfaces, start a PR where you take what you've done here (which seems almost entirely new code, right?) and put it in pyresample/future/spherical/polygons.py. Add imports to pyresample/future/spherical/__init__.py to support the imports you are hoping for (I would have suggested this anyway). Make a commit, push, and start the PR from here. You could even just modify this existing one if you really want. Then add tests with inspiration from the existing tests and any other information you know since you've developed this (what are the edge cases, what new functionality is possible by the user). Make multiple commits during this step and push. Then start refactoring into multiple modules if it makes sense. If you or anyone else doesn't see the benefit of this then we can move it all to pyresample/future/spherical.py.

Given that this seems like a new and separate interface from the existing functionality (correct me if I'm wrong) then starting from the point of the existing interfaces doesn't have a huge benefit in my eyes. So I say just make the new modules and I'll review it when it is ready.

djhoese commented 2 years ago

@ghiggi I talked with Martin today in a video call about this (we had the video call about something else, but then talked about this PR). I thought this PR was more a complete rewrite than it was so I have some suggestions. Let me know what you think.

I think you could make a lot of little PRs out of this PR. I'm not entirely sure as I write this where everything should go. Possible PRs (with tests!):

  1. Each bug fix that doesn't require any of the other functionality should be a separate PR (for each bug fix).
  2. Each class marked as (NEW) in your description should be a separate PR (one PR per class...with tests).
  3. All the renames should be done in one single PR. If possible subclassing or the creation of base classes should be used to preserve the old class while the new class can be updated in later PRs with additional functionality. If needed an old class's __init__ could be updated with a DeprecationWarning saying to use one of the new classes.
  4. Additional methods/functionality that are related should be in separate PRs. For example, one for __contains__/within/intersection, one for plotting, etc. If any of this can be done from a base class for all spherical classes that would be great. Like a central def plot method. Or make plot a separate utility function outside of the classes (Martin wanted this, I'm less sure).
ghiggi commented 2 years ago

Some quick considerations/thoughts:

In my mind I plan to commit 1/2 day per week on these PRs, to give you time to review.

I will also start parallelly to add incremental small PR for the Boundary retrieval and SwathDefinition methods. I am not sure I see how to add new methods in /future/geometry/swath (monkey patching?). Maybe I will need an example ;)

At a given point, we should discuss also how to implement a function to test if a point is contained in a polygon. This will be required to enable the API to work on all edge cases. I couldn't figure out which maths are required ...

djhoese commented 2 years ago

That sounds like it should work. For the future SwathDefinition, we'd have to make a subclass but I'm not sure that is even the right approach yet. As long as it doesn't break backwards compatibility, could you define new methods or change existing methods in the current SwathDefinition that use the future spherical stuff?

ghiggi commented 2 years ago

@mraspaud @djhoese. It's a long time since the last iterations on this refactoring. Unfortunately, I never found the time to advance and implement the tests ...
I was thinking ... if I come to the PCW meeting in November in Athens, do you think it could make sense to address (part of) this this refactoring during the week? I think that in-person discussion will facilitate the design and implementation of the required PRs. Let me know ;-)

mraspaud commented 2 years ago

It will definitely help, yes! Live discussions, but also dedicated time will allow you to work on this efficiently.

ghiggi commented 2 years ago

In such a case, would you agree to have a meeting already in mid of October to briefly discuss again the structure of the PRs? In this way, I could already do some preparatory PRs to be efficient as possible during the week in Athens.

mraspaud commented 2 years ago

Works for me!

ghiggi commented 1 year ago

Hey @mraspaud. Do you have time in the week between the 17 and 21 October? Do @djhoese want to participate too? Have a nice week :)

djhoese commented 1 year ago

I'd be OK joining any conversations, but given the time zone differences don't let my schedule stop you from meeting.

Edit: It may be best to have this conversation on slack.