pytroll / satpy

Python package for earth-observing satellite data processing
http://satpy.readthedocs.org/en/latest/
GNU General Public License v3.0

Add 'to_yaml' method to base compositor class #514

Open djhoese opened 5 years ago

djhoese commented 5 years ago

Is your feature request related to a problem? Please describe. While reviewing @pnuu's new compositor documentation, I was thinking it would be much easier to document the YAML part of configuration if the base compositor had a to_yaml method that returned a YAML string that could be placed in a configuration file.

Describe the solution you'd like A to_yaml instance method, or maybe a classmethod, depending on what it does. Is it possible to support both (staticmethod?)? If called from an instance, the dependencies provided could be entered directly into the YAML. If called on a class, the dependencies could be given temporary names. Optionally, use the inspect module to figure out how many required dependencies there are.
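A minimal sketch of the inspect idea, under stated assumptions: RatioSketch and required_inputs are hypothetical names, not part of Satpy, and the approach only helps when each dependency is a separate parameter of __call__. A compositor whose __call__ takes a single sequence of inputs would report just one required input.

```python
import inspect


class RatioSketch:
    """Hypothetical compositor whose inputs are separate __call__ parameters."""

    def __call__(self, numerator, denominator, mask=None):
        return numerator / denominator


def required_inputs(compositor_cls):
    """Return the names of __call__ parameters that have no default value."""
    params = inspect.signature(compositor_cls.__call__).parameters.values()
    return [
        p.name
        for p in params
        if p.name != "self"
        and p.kind in (p.POSITIONAL_ONLY, p.POSITIONAL_OR_KEYWORD)
        and p.default is p.empty
    ]


# These names could serve as the temporary prerequisite names in the YAML.
placeholder_names = required_inputs(RatioSketch)
```

Using the parameter names as placeholders keeps the class-level output readable without needing an instance.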

Describe any changes to existing user workflow No, only additional functionality.

Additional python or other dependencies No.

Describe any changes required to the build process None.

Describe alternatives you've considered A separate function, but that may be overkill. Different names for the method should be considered for consistency. This functionality could also be used for writers, enhancements, and maybe readers. @mraspaud what is the name of the method in pyresample?

mraspaud commented 5 years ago

It's AreaDefinition.create_areas_def in pyresample.

mraspaud commented 5 years ago

Removing the milestone on this one, as it keeps moving forward

IamRaviTejaG commented 5 years ago

@djhoese @mraspaud I would like to work on this issue.

I understand that this is a feature request for a to_yaml() method which would return data (from within a compositor class) in YAML format, or would preferably write it to a YAML file. Is this correct?

Also, please let me know where to find the Compositor class in the codebase, so I know where to get started.

mraspaud commented 5 years ago

@IamRaviTejaG Welcome! The preference is to output a YAML string for now. The base compositor class is here: https://github.com/pytroll/satpy/blob/master/satpy/composites/__init__.py#L272 and the GenericCompositor, which is probably the easiest to start with, is here: https://github.com/pytroll/satpy/blob/master/satpy/composites/__init__.py#L777

The different compositor classes are documented here: https://satpy.readthedocs.io/en/latest/composites.html Example of a YAML file using these: https://github.com/pytroll/satpy/blob/master/satpy/etc/composites/visir.yaml#L93

So, for example, with the first example in the documentation:

>>> from satpy.composites import GenericCompositor
>>> compositor = GenericCompositor("overview")
>>> composite = compositor([local_scene[0.6],
...                         local_scene[0.8],
...                         local_scene[10.8]])

we should be able to do

composite.to_yaml()

and get something like

  overview:
    compositor: !!python/name:satpy.composites.GenericCompositor
    prerequisites:
    - name: VIS006
      wavelength: 0.6
      ...
    - name: VIS008
      wavelength: 0.8
      ...
    - name: IR_108
      wavelength: 10.8
      ...
    standard_name: overview

depending on the data/satellite you are working with.

The next step is nested compositors like these:

  dust:
    compositor: !!python/name:satpy.composites.GenericCompositor
    prerequisites:
    - compositor: !!python/name:satpy.composites.DifferenceCompositor
      prerequisites:
      - 12.0
      - 10.8
    - compositor: !!python/name:satpy.composites.DifferenceCompositor
      prerequisites:
      - 10.8
      - 8.7
    - 10.8
    standard_name: dust

Hope it helps!
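The nested case above could be handled by a recursive serializer. The following is only a sketch under assumptions: SketchCompositor, GenericSketch, DifferenceSketch, and to_yaml are hypothetical stand-ins, not Satpy classes, and the YAML is assembled by hand rather than through a YAML library.

```python
class SketchCompositor:
    """Hypothetical stand-in; prerequisites may be wavelengths or compositors."""

    def __init__(self, prerequisites, name=None, standard_name=None):
        self.prerequisites = list(prerequisites)
        self.name = name
        self.standard_name = standard_name


class GenericSketch(SketchCompositor):
    pass


class DifferenceSketch(SketchCompositor):
    pass


def _body_lines(comp):
    """Return the unindented YAML lines describing one compositor."""
    cls = type(comp)
    lines = [
        f"compositor: !!python/name:{cls.__module__}.{cls.__qualname__}",
        "prerequisites:",
    ]
    for prereq in comp.prerequisites:
        if isinstance(prereq, SketchCompositor):
            # Recurse, then hang the nested mapping off a list dash.
            nested = _body_lines(prereq)
            lines.append(f"- {nested[0]}")
            lines.extend(f"  {line}" for line in nested[1:])
        else:
            lines.append(f"- {prereq}")
    if comp.standard_name:
        lines.append(f"standard_name: {comp.standard_name}")
    return lines


def to_yaml(comp):
    """Serialize a named compositor and everything nested inside it."""
    body = [f"  {line}" for line in _body_lines(comp)]
    return "\n".join([f"{comp.name}:"] + body) + "\n"


dust = GenericSketch(
    [DifferenceSketch([12.0, 10.8]), DifferenceSketch([10.8, 8.7]), 10.8],
    name="dust",
    standard_name="dust",
)
print(to_yaml(dust))
```

Keeping the recursion in a helper that returns unindented lines means each nesting level only has to add its own two spaces.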

IamRaviTejaG commented 5 years ago

@mraspaud Thanks for the detailed info. From what I understand from this snippet:

>>> from satpy.composites import GenericCompositor
>>> compositor = GenericCompositor("overview")
>>> composite = compositor([local_scene[0.6],
...                         local_scene[0.8],
...                         local_scene[10.8]])

We first initialize a GenericCompositor instance with the name overview, and then we pass a list of DataArrays into it. Does this get passed into the CompositeBase class's kwargs[prerequisites]? I'm a little confused as to how everything flows. Please help me with this.

  1. To implement a to_yaml() inside the GenericCompositor class, I need access to the name, which goes directly into the CompositeBase class via L#791. Do I store it in an instance attribute there, e.g. self.currentInstanceName = name somewhere around L#784, or is it already available via an existing attribute? And how do I get the passed values, like:
    [local_scene[0.6], local_scene[0.8], local_scene[10.8]]

    Also, please link me to resources about local_scene and other prerequisites and their names.

djhoese commented 5 years ago

@mraspaud This brings up a good point. Short answer: maybe to_yaml takes the input DataArrays (just like the call method does) and uses their metadata (name, wavelength, etc) to figure out what to put in the YAML.
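A rough sketch of that short answer, with assumptions labeled: the inputs are modeled as objects with an attrs dict (as xarray DataArrays have), wavelength is assumed to be a single number as in the example YAML earlier in the thread, and prerequisite_entries and prerequisites_to_yaml are hypothetical helper names.

```python
from types import SimpleNamespace


def prerequisite_entries(data_arrays, keys=("name", "wavelength")):
    """Pick the identifying metadata out of each input's attrs dict."""
    return [
        {key: arr.attrs[key] for key in keys if key in arr.attrs}
        for arr in data_arrays
    ]


def prerequisites_to_yaml(entries):
    """Format the entries as a YAML 'prerequisites' list (built by hand)."""
    lines = ["prerequisites:"]
    for entry in entries:
        for i, (key, value) in enumerate(entry.items()):
            prefix = "- " if i == 0 else "  "
            lines.append(f"{prefix}{key}: {value}")
    return "\n".join(lines) + "\n"


# Stand-ins for loaded DataArrays; only the attrs dict matters here.
inputs = [
    SimpleNamespace(attrs={"name": "VIS006", "wavelength": 0.6, "units": "%"}),
    SimpleNamespace(attrs={"name": "VIS008", "wavelength": 0.8, "units": "%"}),
    SimpleNamespace(attrs={"name": "IR_108", "wavelength": 10.8, "units": "K"}),
]
print(prerequisites_to_yaml(prerequisite_entries(inputs)))
```

Restricting the output to a fixed set of keys avoids dumping every attribute of the input into the configuration.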

@IamRaviTejaG As an explanation: Normally when we create composites from the YAML configuration the prerequisites information gets passed to the Compositor's __init__ method and stored as self.prerequisites (I think). These are then used to figure out what prerequisites should be loaded to make that composite. So YAML like:

  overview:
    compositor: !!python/name:satpy.composites.GenericCompositor
    prerequisites:
    - name: VIS006
    - name: VIS008
    - name: IR_108
    standard_name: overview

would create a compositor by doing comp = GenericCompositor('overview', prerequisites=['VIS006', 'VIS008', 'IR_108']). This compositor instance then has self.name set to 'overview' and self.prerequisites set to that list of strings. The rest of Satpy (specifically the Scene object) will look at that prerequisites property and load the necessary data to make the composite. Once loaded the data will be passed to the compositor object like comp((vis006_data_arr, vis008_data_arr, ir_108_data_arr)).
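Given that flow, an instance-level to_yaml could in principle just echo back what __init__ stored. A minimal sketch, assuming a hypothetical stand-in class (SketchCompositor is not Satpy's CompositeBase) and hand-built YAML:

```python
class SketchCompositor:
    """Hypothetical stand-in that stores its configuration in __init__."""

    def __init__(self, name, prerequisites=None, standard_name=None):
        self.name = name
        self.prerequisites = list(prerequisites or [])
        self.standard_name = standard_name or name

    def to_yaml(self, indent=2):
        """Return a YAML string that would recreate this instance's config."""
        cls = type(self)
        pad = " " * indent
        lines = [
            f"{self.name}:",
            f"{pad}compositor: !!python/name:{cls.__module__}.{cls.__qualname__}",
            f"{pad}prerequisites:",
        ]
        lines += [f"{pad}- name: {prereq}" for prereq in self.prerequisites]
        lines.append(f"{pad}standard_name: {self.standard_name}")
        return "\n".join(lines) + "\n"


comp = SketchCompositor("overview", prerequisites=["VIS006", "VIS008", "IR_108"])
print(comp.to_yaml())
```

This only works when the prerequisites were given at construction time; a compositor created interactively without them would produce an empty list.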

Hopefully that clears some of this up. I'm not sure what the best solution is.

mraspaud commented 5 years ago

@djhoese Small correction: comp = GenericCompositor('overview')(['VIS006', 'VIS008', 'IR_108']), right?

mraspaud commented 5 years ago

@IamRaviTejaG GenericCompositor is a subclass of CompositeBase, so attributes from the latter should be available in the former.

djhoese commented 5 years ago

No. The compositor instance is made (at least when using the contents of a YAML file) with the __init__ call to GenericCompositor that I specified above. The __call__ method is only used when generating the composite DataArray from the input data.

When using compositors interactively it isn't necessary to specify the names of the prerequisites because the user is manually going to pass the proper arguments (the ones they want) to the __call__ method.

Edit:

# under the hood in Scene when loading from YAML
comp = GenericCompositor('overview', prerequisites=['VIS006', 'VIS008', 'IR_108'])
my_new_data = comp((scn['VIS006'], scn['VIS008'], scn['IR_108']))
# interactively
scn = Scene(...)
scn.load(['VIS006', 'VIS008', 'IR_108'])

comp = GenericCompositor('overview')
my_new_data = comp((scn['VIS006'], scn['VIS008'], scn['IR_108']))

Edit 2: The prerequisites is a kwarg in __init__. I've fixed both of my previous comments.