pytroll / satpy

Python package for earth-observing satellite data processing
http://satpy.readthedocs.org/en/latest/
GNU General Public License v3.0

Dependency tree not propagating resolution #151

Open mraspaud opened 6 years ago

mraspaud commented 6 years ago

Code Sample, a minimal, complete, and verifiable piece of code

import glob
from datetime import datetime

from satpy.scene import Scene
from satpy.utils import debug_on

debug_on()

if __name__ == '__main__':
    # Open the VIIRS SDR files and request the low-resolution true color composite.
    scn = Scene(
        sensor="viirs",
        start_time=datetime(2015, 3, 11, 11, 15),
        end_time=datetime(2015, 3, 11, 11, 30),
        filenames=glob.glob("/home/a001673/data/satellite/Suomi-NPP/viirs/lvl1b/2015/03/11/SDR/*"),
        reader="viirs_sdr")
    composite = 'true_color_lowres'
    scn.load([composite])

Add a breakpoint at the beginning of the Scene.compute method, and print out the dependency tree:

pdb> print(self.dep_tree)

Problem description

Currently, the dependency search for composites doesn't propagate the correct resolution to the dependencies, which makes the composite and modifier configs in the yaml files more complicated than they need to be.

For example, such a yaml definition:

  rayleigh_corrected:
    compositor: !!python/name:satpy.composites.PSPRayleighReflectance
    atmosphere: us-standard
    aerosol_type: marine_clean_aerosol
    prerequisites:
    - name: M05
      modifiers: [sunz_corrected]
    optional_prerequisites:
    - satellite_azimuth_angle
    - satellite_zenith_angle
    - solar_azimuth_angle
    - solar_zenith_angle

  true_color_lowres:
    compositor: !!python/name:satpy.composites.RGBCompositor
    prerequisites:
    - name: M05
      modifiers: [sunz_corrected, rayleigh_corrected]
    - name: M04
      modifiers: [sunz_corrected, rayleigh_corrected]
    - name: M03
      modifiers: [sunz_corrected, rayleigh_corrected]
    standard_name: true_color

where the angles are defined at both M-band and I-band resolution, leads to the wrong resolution being chosen under certain circumstances (for example, when the I-band angles are defined first in the yaml dataset list).

If no modification is made to the above yaml, the presented code crashes further down the line because it does not have access to the M-band resolution angles.

Expected Behaviour

I'm expecting the dependency resolution to choose the right resolution when multiple resolutions are available for some datasets but not others.

Of course, creating per-resolution modifiers fixes the problem, but I find this ugly and non-optimal.

Actual Result, Traceback if applicable

ipdb> print(self.dep_tree)
None (No Data)
 +DatasetID(name='true_color_lowres', wavelength=None, resolution=None, polarization=None, calibration=None, modifiers=None)
 + +DatasetID(name='M05', wavelength=None, resolution=None, polarization=None, calibration=None, modifiers=('sunz_corrected', 'rayleigh_corrected'))
 + + +DatasetID(name='M05', wavelength=(0.662, 0.672, 0.682), resolution=742, polarization=None, calibration='reflectance', modifiers=('sunz_corrected',))
 + + +DatasetID(name='M05', wavelength=(0.662, 0.672, 0.682), resolution=742, polarization=None, calibration='reflectance', modifiers=('sunz_corrected',))
 + + +DatasetID(name='satellite_azimuth_angle', wavelength=None, resolution=371, polarization=None, calibration=None, modifiers=())
 + + +DatasetID(name='satellite_zenith_angle', wavelength=None, resolution=371, polarization=None, calibration=None, modifiers=())
 + + +DatasetID(name='solar_azimuth_angle', wavelength=None, resolution=371, polarization=None, calibration=None, modifiers=())
 + + +DatasetID(name='solar_zenith_angle', wavelength=None, resolution=371, polarization=None, calibration=None, modifiers=())
 + +DatasetID(name='M04', wavelength=None, resolution=None, polarization=None, calibration=None, modifiers=('sunz_corrected', 'rayleigh_corrected'))
 + + +DatasetID(name='M04', wavelength=(0.545, 0.555, 0.565), resolution=742, polarization=None, calibration='reflectance', modifiers=('sunz_corrected',))
 + + +DatasetID(name='M05', wavelength=(0.662, 0.672, 0.682), resolution=742, polarization=None, calibration='reflectance', modifiers=('sunz_corrected',))
 + + +DatasetID(name='satellite_azimuth_angle', wavelength=None, resolution=371, polarization=None, calibration=None, modifiers=())
 + + +DatasetID(name='satellite_zenith_angle', wavelength=None, resolution=371, polarization=None, calibration=None, modifiers=())
 + + +DatasetID(name='solar_azimuth_angle', wavelength=None, resolution=371, polarization=None, calibration=None, modifiers=())
 + + +DatasetID(name='solar_zenith_angle', wavelength=None, resolution=371, polarization=None, calibration=None, modifiers=())
 + +DatasetID(name='M03', wavelength=None, resolution=None, polarization=None, calibration=None, modifiers=('sunz_corrected', 'rayleigh_corrected'))
 + + +DatasetID(name='M03', wavelength=(0.478, 0.488, 0.498), resolution=742, polarization=None, calibration='reflectance', modifiers=('sunz_corrected',))
 + + +DatasetID(name='M05', wavelength=(0.662, 0.672, 0.682), resolution=742, polarization=None, calibration='reflectance', modifiers=('sunz_corrected',))
 + + +DatasetID(name='satellite_azimuth_angle', wavelength=None, resolution=371, polarization=None, calibration=None, modifiers=())
 + + +DatasetID(name='satellite_zenith_angle', wavelength=None, resolution=371, polarization=None, calibration=None, modifiers=())
 + + +DatasetID(name='solar_azimuth_angle', wavelength=None, resolution=371, polarization=None, calibration=None, modifiers=())
 + + +DatasetID(name='solar_zenith_angle', wavelength=None, resolution=371, polarization=None, calibration=None, modifiers=())
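
The mismatch in the tree above can be spotted mechanically. The following is only a minimal sketch: `branch` is a hand-copied, simplified view of the first `rayleigh_corrected` branch (name and resolution only), and `mismatched` is a hypothetical helper, not part of satpy.

```python
# Hypothetical, simplified copy of one branch of the tree above:
# (name, resolution in metres) for each resolved input of the modified M05.
branch = [
    ("M05", 742),
    ("satellite_azimuth_angle", 371),
    ("satellite_zenith_angle", 371),
    ("solar_azimuth_angle", 371),
    ("solar_zenith_angle", 371),
]


def mismatched(inputs):
    """Return the names of inputs whose resolution differs from the first input's."""
    reference = inputs[0][1]
    return [name for name, res in inputs if res != reference]


# All four angles come out at I-band resolution while the band is at M-band resolution.
print(mismatched(branch))
```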

Versions of Python, package at hand and relevant dependencies

Python 2.7, 3.6, satpy 7.8 or develop as of today.

djhoese commented 6 years ago

Just to be clear, I'm 90% sure the I-band resolution angles are chosen because they are the higher resolution, not because of the order in which they are listed in the YAML. YAML order shouldn't matter. If it does, that should be changed.

mraspaud commented 6 years ago

@davidh-ssec hmmm, sounds plausible.

djhoese commented 6 years ago

So basically the dependency tree needs to try to get composite dependencies at the same resolution as the other inputs. If it can't find the resolution it can fall back to the highest resolution. A couple design questions:

  1. Does this happen for modifiers and composites?
  2. If for composites, the only way this makes sense is if we find the first dependency and then say we want the other dependencies at that resolution. So we want B01, B02, B03 for our composite. We get B01 at 1000m, then we look for B02 at 1000m, it is returned at 1000m even though there is a 250m available. Is that expected behavior? Is that right? I suppose if we just check for the highest resolution of each channel then the composite just has to handle that.
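
The "same resolution first, highest resolution as fallback" idea could be sketched roughly as below. This is an illustration under assumptions, not satpy's dependency tree code: `Candidate` is a hypothetical stand-in for `DatasetID` reduced to two fields, and `pick_dependency` is a made-up helper name.

```python
from collections import namedtuple

# Hypothetical stand-in for satpy's DatasetID, reduced to two fields.
Candidate = namedtuple("Candidate", ["name", "resolution"])


def pick_dependency(candidates, name, preferred_resolution=None):
    """Pick the candidate called `name`, preferring `preferred_resolution`.

    Falls back to the highest resolution (smallest value in metres)
    when no candidate matches the preferred one.
    """
    matching = [c for c in candidates if c.name == name]
    if not matching:
        raise KeyError(name)
    for cand in matching:
        if cand.resolution == preferred_resolution:
            return cand
    return min(matching, key=lambda c: c.resolution)


available = [
    Candidate("solar_zenith_angle", 371),
    Candidate("solar_zenith_angle", 742),
    Candidate("M05", 742),
]

# Resolve the band first, then ask for the angle at the band's resolution.
parent = pick_dependency(available, "M05")
angle = pick_dependency(available, "solar_zenith_angle", parent.resolution)
print(angle)  # Candidate(name='solar_zenith_angle', resolution=742)
```

Without a preferred resolution the helper returns the 371m angle, which is the behaviour seen in the tree reported in this issue.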

I can foresee issues where one composite's dependencies result in the 1000m SZA being loaded and another composite wants a 250m resolution that is unavailable. It would then see that 1000m is already loaded and use that, instead of loading the 500m resolution that would be a better choice. Maybe...not sure.
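
To make that scenario concrete, here is a small sketch comparing "reuse what is already loaded" with "pick the closest available resolution". The lists and the `closest_available` helper are hypothetical, and "closest" is only one possible reading of "better choice".

```python
def closest_available(wanted, available):
    """Return the available resolution (metres) closest to the wanted one."""
    return min(available, key=lambda res: abs(res - wanted))


available = [500, 1000]   # resolutions the reader could provide (hypothetical)
already_loaded = [1000]   # loaded earlier for another composite
wanted = 250              # requested, but not available

naive = already_loaded[0]                       # reuse whatever is loaded
better = closest_available(wanted, available)   # look at everything available
print(naive, better)  # 1000 500
```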

djhoese commented 6 years ago

@mraspaud Didn't we fix this already?

mraspaud commented 4 years ago

No, I just found another example:

ipdb> print(self.dep_tree)
None (No Data)
 +DatasetID(name='snow', wavelength=None, resolution=None, polarization=None, calibration=None, level=None, modifiers=None)
 + +DatasetID(name='DNB', wavelength=(0.5, 0.7, 0.9), resolution=743, polarization=None, calibration='radiance', level=None, modifiers=('sunz_corrected',))
 + + +DatasetID(name='DNB', wavelength=(0.5, 0.7, 0.9), resolution=743, polarization=None, calibration='radiance', level=None, modifiers=())
 + + +DatasetID(name='solar_zenith_angle', wavelength=None, resolution=742, polarization=None, calibration=None, level=None, modifiers=())
 + +DatasetID(name='M10', wavelength=(1.58, 1.61, 1.64), resolution=742, polarization=None, calibration='reflectance', level=None, modifiers=('sunz_corrected',))
 + +DatasetID(name='I04', wavelength=(3.58, 3.74, 3.9), resolution=371, polarization=None, calibration='brightness_temperature', level=None, modifiers=('nir_reflectance',))
 + + +DatasetID(name='I04', wavelength=(3.58, 3.74, 3.9), resolution=371, polarization=None, calibration='brightness_temperature', level=None, modifiers=())
 + + +DatasetID(name='M15', wavelength=(10.263, 10.763, 11.263), resolution=742, polarization=None, calibration='brightness_temperature', level=None, modifiers=())
 + + +DatasetID(name='solar_zenith_angle', wavelength=None, resolution=371, polarization=None, calibration=None, level=None, modifiers=())

djhoese commented 4 years ago

In that last example, what's the expected output?

mraspaud commented 4 years ago

ipdb> print(self.dep_tree)
None (No Data)
 +DatasetID(name='snow', wavelength=None, resolution=None, polarization=None, calibration=None, level=None, modifiers=None)
 + +DatasetID(name='DNB', wavelength=(0.5, 0.7, 0.9), resolution=743, polarization=None, calibration='radiance', level=None, modifiers=('sunz_corrected',))
 + + +DatasetID(name='DNB', wavelength=(0.5, 0.7, 0.9), resolution=743, polarization=None, calibration='radiance', level=None, modifiers=())
 + + +DatasetID(name='solar_zenith_angle', wavelength=None, resolution=742, polarization=None, calibration=None, level=None, modifiers=())
 + +DatasetID(name='M10', wavelength=(1.58, 1.61, 1.64), resolution=742, polarization=None, calibration='reflectance', level=None, modifiers=('sunz_corrected',))
 + +DatasetID(name='I04', wavelength=(3.58, 3.74, 3.9), resolution=371, polarization=None, calibration='brightness_temperature', level=None, modifiers=('nir_reflectance',))
 + + +DatasetID(name='I04', wavelength=(3.58, 3.74, 3.9), resolution=371, polarization=None, calibration='brightness_temperature', level=None, modifiers=())
 + + +DatasetID(name='I05', wavelength=(10.263, 10.763, 11.263), resolution=371, polarization=None, calibration='brightness_temperature', level=None, modifiers=())
 + + +DatasetID(name='solar_zenith_angle', wavelength=None, resolution=371, polarization=None, calibration=None, level=None, modifiers=())