pytroll / satpy

Python package for earth-observing satellite data processing
http://satpy.readthedocs.org/en/latest/
GNU General Public License v3.0

parameterised composite naming #2644

Open gerritholl opened 10 months ago

gerritholl commented 10 months ago

Feature Request

Is your feature request related to a problem? Please describe.

Several composites in Satpy exist in several variants with labels such as "raw", "corrected", or "uncorrected" attached to the name. Some examples:

https://github.com/pytroll/satpy/blob/5ecf1abefdfcd17fb3dd41ddcaed53499bf8f7c9/satpy/etc/composites/vii.yaml#L26

(this one nevertheless has sza_correction enabled)

https://github.com/pytroll/satpy/blob/5ecf1abefdfcd17fb3dd41ddcaed53499bf8f7c9/satpy/etc/composites/fci.yaml#L70

https://github.com/pytroll/satpy/blob/5ecf1abefdfcd17fb3dd41ddcaed53499bf8f7c9/satpy/etc/composites/fci.yaml#L81

https://github.com/pytroll/satpy/blob/5ecf1abefdfcd17fb3dd41ddcaed53499bf8f7c9/satpy/etc/composites/fci.yaml#L92

https://github.com/pytroll/satpy/blob/5ecf1abefdfcd17fb3dd41ddcaed53499bf8f7c9/satpy/etc/composites/ahi.yaml#L296

https://github.com/pytroll/satpy/blob/5ecf1abefdfcd17fb3dd41ddcaed53499bf8f7c9/satpy/etc/composites/visir.yaml#L221

Some problems with this approach:

There are several corrections or modifications that might be applied to composites, so "uncorrected" or "corrected" is inherently ambiguous. Does this refer to Rayleigh correction, hybrid green correction, SZA correction, limb correction? Naively, one might think that "uncorrected" means none of those corrections have been applied, but as the true_color_uncorrected above shows, this is not the case. And even then, "uncorrected" is about as meaningful as an "unfiltered" photo (looking at you, Instagram™), in that there is no such thing as an uncorrected satellite image when we use L1.5 data as input.

Describe the solution you'd like

I'm not sure what the best solution would look like exactly, but maybe we need a way to load parameterised composites, such that the user can do:

scene.load(["true_color[rayleigh_correction=true,green_correction=none]", "true_color[rayleigh_correction=true,green_correction=ndvi]"])

or maybe using DataIDs is better:

scene.load([DataQuery("true_color", rayleigh_correction=True, green_correction=None), DataQuery("true_color", rayleigh_correction=True, green_correction="ndvi")])

and then Satpy would use a well-defined and well-documented system to find the corresponding YAML definitions.

The unparameterised versions such as a basic .load(["true_color"]) would map to defined default parameters, just like today.
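To make the string form concrete, here is a minimal sketch of how such a request could be split into a name and parameters, with defaults filled in for anything not given. Nothing here is existing Satpy API: `parse_composite_request`, `DEFAULTS`, and the default values themselves are assumptions for illustration only.

```python
import re

# Hypothetical defaults; in practice these would come from the composite YAML.
DEFAULTS = {
    "true_color": {"rayleigh_correction": True, "green_correction": "ndvi"},
}

# Matches "name" or "name[key=value,key=value]".
_REQUEST = re.compile(r"^(?P<name>[^\[\]]+)(?:\[(?P<params>[^\[\]]*)\])?$")


def _parse_value(text):
    """Turn 'true', 'false' and 'none' into Python objects; keep other strings."""
    stripped = text.strip()
    special = {"true": True, "false": False, "none": None}
    return special.get(stripped.lower(), stripped)


def parse_composite_request(request):
    """Split a parameterised composite name into (name, parameters)."""
    match = _REQUEST.match(request.strip())
    if match is None:
        raise ValueError(f"Cannot parse composite request: {request!r}")
    name = match.group("name").strip()
    # Start from the defaults so that a bare name behaves like today.
    params = dict(DEFAULTS.get(name, {}))
    raw = match.group("params")
    if raw:
        for item in raw.split(","):
            key, sep, value = item.partition("=")
            if not sep:
                raise ValueError(f"Expected key=value, got {item!r}")
            params[key.strip()] = _parse_value(value)
    return name, params
```

With this sketch, `parse_composite_request("true_color")` returns the defaults unchanged, while explicitly given parameters override them one by one.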

Describe any changes to existing user workflow

Existing names would map to specific definitions that would be identical to the status quo, so backward compatibility can be maintained.

Additional context

In the documentation, we could then automatically generate all composite combinations with their definitions, so that it's transparent for users what's going on.
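As a rough sketch of what that generation step might look like, the combinations can be enumerated with `itertools.product`. The parameter space and the `enumerate_variants` helper below are hypothetical; a real implementation would read the allowed values from the YAML definitions.

```python
from itertools import product

# Hypothetical parameter space for one composite.
PARAMETER_SPACE = {
    "rayleigh_correction": [True, False],
    "green_correction": [None, "ndvi"],
}


def enumerate_variants(name, parameter_space):
    """Yield (label, parameters) for every combination of parameter values."""
    keys = list(parameter_space)
    for values in product(*(parameter_space[key] for key in keys)):
        params = dict(zip(keys, values))
        label = ",".join(f"{key}={value}" for key, value in params.items())
        yield f"{name}[{label}]", params


if __name__ == "__main__":
    # One line per variant, e.g. for a table in the documentation.
    for label, _params in enumerate_variants("true_color", PARAMETER_SPACE):
        print(label)
```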

djhoese commented 10 months ago

Another case like this would be different resolutions of dependencies going into a composite (I'm a user and I want this composite to use the 500m band 2 even though the 250m band 2 is available) or even the "corrections" applied to those individual dependencies.

Theoretically this situation is supposed to be covered by the ability to specify "modifiers" on the Scene.load call, but in reality this has never been tested or well defined for composites. So you should be able to do (again, in theory):

scn.load([DataQuery(name="my_rgb", modifiers=("sunz_corrected",))])

But I don't think this works in any sense because the RGB doesn't have those corrections/modifiers, its dependencies do. And there are some composites where corrections only apply to some dependencies (ex. rayleigh doesn't make sense for certain wavelengths), so is it "accurate" to say that a composite is "rayleigh_corrected" if the correction isn't applied to all inputs?

So back to theory: I'm not saying the above is the best solution and it doesn't handle all cases (described below), but it gives us a way forward. I think theoretically (yep, I'm using that word again) with a few code updates in the dependency tree and composite YAML loading stuff you'd be able to do the above line of code and have variants of the composite defined in the YAML with a modifiers: ["sunz_corrected"] and another one with modifiers: ["sunz_corrected", "rayleigh_corrected"] and the Scene loading would know the difference. Note that this modifiers: is on the composite and the dependencies could theoretically have whatever modifiers you want.
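A toy version of that lookup, ignoring the real dependency tree entirely, could key the composite variants on the name plus the composite-level modifiers. The registry contents and `find_composite` are made up for illustration and are not how Satpy stores composites today.

```python
# Hypothetical registry: (composite name, composite-level modifiers) -> definition.
# The strings stand in for whatever the YAML loader would produce.
COMPOSITE_VARIANTS = {
    ("my_rgb", ("sunz_corrected",)): "my_rgb with sunz-corrected inputs",
    ("my_rgb", ("sunz_corrected", "rayleigh_corrected")): (
        "my_rgb with sunz- and rayleigh-corrected inputs"
    ),
}


def find_composite(name, modifiers):
    """Return the variant whose composite-level modifiers match exactly.

    Modifier order matters in this sketch because the tuple is used as-is.
    """
    key = (name, tuple(modifiers))
    try:
        return COMPOSITE_VARIANTS[key]
    except KeyError:
        raise KeyError(f"No variant of {name!r} with modifiers {key[1]}") from None
```

Here the modifiers act purely as a label on the composite; what each dependency actually has applied is left to the definition behind that label.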

Cases not handled: In the best case users would be able to customize low-level kwargs like sunz correction angle thresholds. With the above suggestion you'd need a separately defined modifier in YAML for each variation you want to support (88 degrees, 90 degrees, 92 degrees, etc.).

One other complication: I forget what the Scene (or rather the DatasetDict) does when you request a generic product name (ex. "C01") but both modified and unmodified versions exist in the same collection/Scene. Like if there's C01 with no modifiers, C01 with sunz, and C01 with sunz + rayleigh and you say scn["C01"], I forget what it gives you. If it gives you the least modified version then this would be annoying for composites. If it gives you the most modified version then it might just work.
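To spell out the two possible behaviours, here is a small stand-alone sketch that picks between keys sharing a name by counting modifiers. It does not reflect what DatasetDict actually does; `pick_by_name` and the `(name, modifiers)` key layout are assumptions.

```python
def pick_by_name(keys, name, prefer="least"):
    """Pick one (name, modifiers) key among those sharing ``name``.

    ``prefer`` is "least" or "most" and refers to the number of modifiers.
    """
    if prefer not in ("least", "most"):
        raise ValueError("prefer must be 'least' or 'most'")
    candidates = [key for key in keys if key[0] == name]
    if not candidates:
        raise KeyError(name)
    chooser = min if prefer == "least" else max
    return chooser(candidates, key=lambda key: len(key[1]))


# The three versions of C01 from the example above.
KEYS = [
    ("C01", ()),
    ("C01", ("sunz_corrected",)),
    ("C01", ("sunz_corrected", "rayleigh_corrected")),
]
```

With `prefer="least"` a plain `"C01"` resolves to the unmodified key, which is the annoying case for composites; with `prefer="most"` it resolves to the sunz + rayleigh key.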