nasa-jpl / autoRIFT

A Python module of a fast and intelligent algorithm for finding the pixel displacement between two images
Apache License 2.0

apply scale factor to M11,M22 matrices (rdr_off2vel_x_vec.tif) #92

Closed jhkennedy closed 11 months ago

jhkennedy commented 11 months ago

As suggested in this ITS_LIVE slack comment: https://itslive.slack.com/archives/CA1B8N740/p1688110892979159?thread_ts=1686824081.457019&cid=CA1B8N740

jhkennedy commented 11 months ago

@alex-s-gardner, @leiyangleon, with this change, do we need/want to include the scale_factor and add_offset attributes of M11,M12?

@mliukis when running the crop function locally, I'm seeing the above two attributes dropped from the cropped granules (Sentinel-1 only). Can you confirm if the cropped v2 S1 granules have those attributes?

mliukis commented 11 months ago

Edit: Yes, we want to keep them, as they are used to provide simple data compression so we can store low-resolution floating-point data as small integers in the netCDF dataset.


> @alex-s-gardner, @leiyangleon, with this change, do we need/want to include the scale_factor and add_offset attributes of M11,M12?
>
> * M11: https://github.com/jhkennedy/autoRIFT/blob/s1-correction/netcdf_output.py#L1115-L1116
>
> * M12: https://github.com/jhkennedy/autoRIFT/blob/s1-correction/netcdf_output.py#L1147-L1148
>
> @mliukis when running the crop function locally, I'm seeing the above two attributes dropped from the cropped granules (Sentinel-1 only). Can you confirm if the cropped v2 S1 granules have those attributes?

@jhkennedy both corrected and cropped S1 granules don't have these attributes. Are these encoding attributes? I am not removing any attributes when cropping the data, but I have to explicitly specify the encoding attributes when writing the granule to disk. I don't specify these attributes as encoding attributes...

jhkennedy commented 11 months ago

> > @alex-s-gardner, @leiyangleon, with this change, do we need/want to include the scale_factor and add_offset attributes of M11,M12?
> >
> > * M11: https://github.com/jhkennedy/autoRIFT/blob/s1-correction/netcdf_output.py#L1115-L1116
> >
> > * M12: https://github.com/jhkennedy/autoRIFT/blob/s1-correction/netcdf_output.py#L1147-L1148
> >
> > @mliukis when running the crop function locally, I'm seeing the above two attributes dropped from the cropped granules (Sentinel-1 only). Can you confirm if the cropped v2 S1 granules have those attributes?
>
> @jhkennedy both corrected and cropped S1 granules don't have these attributes. Are these encoding attributes? I am not removing any attributes when cropping the data, but I have to explicitly specify the encoding attributes when writing the granule to disk. I don't specify these attributes as encoding attributes...

@mliukis I'm not sure what you mean by "encoding" attributes, but when I ncdump -h the netCDF metadata of S1 granules produced by autoRIFT, I see:

    short M11(y, x) ;
        M11:_FillValue = -32767s ;
        M11:standard_name = "conversion_matrix_element_11" ;
        M11:description = "conversion matrix element (1st row, 1st column) that can be multiplied with vx to give range pixel displacement dr (see Eq. A18 in https://www.mdpi.com/2072-4292/13/4/749)" ;
        M11:units = "pixel/(meter/year)" ;
        M11:grid_mapping = "mapping" ;
        M11:dr_to_vr_factor = 141.726547262259 ;
        M11:dr_to_vr_factor_description = "multiplicative factor that converts slant range pixel displacement dr to slant range velocity vr" ;
        M11:scale_factor = 0.0002135763f ;
        M11:add_offset = 0.007011747f ;
    short M12(y, x) ;
        M12:_FillValue = -32767s ;
        M12:standard_name = "conversion_matrix_element_12" ;
        M12:description = "conversion matrix element (1st row, 2nd column) that can be multiplied with vy to give range pixel displacement dr (see Eq. A18 in https://www.mdpi.com/2072-4292/13/4/749)" ;
        M12:units = "pixel/(meter/year)" ;
        M12:grid_mapping = "mapping" ;
        M12:dr_to_vr_factor = 141.726547262259 ;
        M12:dr_to_vr_factor_description = "multiplicative factor that converts slant range pixel displacement dr to slant range velocity vr" ;
        M12:scale_factor = 0.0002165926f ;
        M12:add_offset = 0.002233876f ;

And then when cropped with xarray, the scale_factor and add_offset attributes are missing. I think it's likely because they have the f suffix specifier (set as np.float32 in the packaging script) on them (notably, dr_to_vr_factor still exists and is a "normal" float).

mliukis commented 11 months ago

> @mliukis I'm not sure what you mean by "encoding" attributes, but when I ncdump -h the netCDF metadata of S1 granules produced by autoRIFT, I see:

@jhkennedy _FillValue, for example, is an encoding attribute in xarray. It has to be set not as an attribute of the data variable but as an entry in the "encoding" map parameter when writing an xarray.Dataset to file with: cropped_ds.to_netcdf(fixed_file, engine='h5netcdf', encoding=granule_encoding)

Here is how I define the encoding attributes for S1. The definition doesn't include the two missing attributes, which makes me think they should have been specified here: https://github.com/nasa-jpl/its_live_production/blob/daeda192017aa44e02f8492d2936168f7bc3017f/src/tools/mission_info.py#L21-L35
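For illustration, an encoding map that carries the two packing attributes might look like the sketch below. The scale_factor, add_offset, and _FillValue values are copied from the ncdump -h output earlier in this thread; the helper name `packed_encoding` and the dictionary layout are assumptions, not the contents of mission_info.py. One plausible explanation for the dropped attributes, offered as an assumption rather than a confirmed diagnosis, is that xarray moves scale_factor and add_offset from a variable's attributes into its `.encoding` when decoding a file, and an explicit `encoding=` entry for that variable then replaces the stored encoding on write.

```python
def packed_encoding(scale_factor, add_offset, fill_value=-32767):
    """Return an xarray-style encoding entry for an int16-packed variable.

    Hypothetical helper for illustration; not part of autoRIFT or
    its_live_production.
    """
    return {
        "dtype": "int16",
        "_FillValue": fill_value,
        "scale_factor": scale_factor,
        "add_offset": add_offset,
    }


# Values copied from the ncdump -h output above. They differ between M11 and
# M12, so real code would need to read them from each source granule.
granule_encoding = {
    "M11": packed_encoding(0.0002135763, 0.007011747),
    "M12": packed_encoding(0.0002165926, 0.002233876),
}

# Hypothetical usage, mirroring the call quoted above:
# cropped_ds.to_netcdf(fixed_file, engine="h5netcdf", encoding=granule_encoding)
```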

jhkennedy commented 11 months ago

@mliukis ah, yes, those two are special netCDF attributes: https://docs.unidata.ucar.edu/netcdf-c/current/attribute_conventions.html

> scale_factor
>
> If present for a variable, the data are to be multiplied by this factor after the data are read by the application that accesses the data.
>
> If valid values are specified using the valid_min, valid_max, valid_range, or _FillValue attributes, those values should be specified in the domain of the data in the file (the packed data), so that they can be interpreted before the scale_factor and add_offset are applied.
>
> add_offset
>
> If present for a variable, this number is to be added to the data after it is read by the application that accesses the data. If both scale_factor and add_offset attributes are present, the data are first scaled before the offset is added. The attributes scale_factor and add_offset can be used together to provide simple data compression to store low-resolution floating-point data as small integers in a netCDF dataset. When scaled data are written, the application should first subtract the offset and then divide by the scale factor, rounding the result to the nearest integer to avoid a bias caused by truncation towards zero.
>
> When scale_factor and add_offset are used for packing, the associated variable (containing the packed data) is typically of type byte or short, whereas the unpacked values are intended to be of type float or double. The attributes scale_factor and add_offset should both be of the type intended for the unpacked data, e.g. float or double.

It's possible xarray just directly applied them when reading, as suggested here, and wrote out a normal float with those applied.
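The packing convention quoted above can be sketched in a few lines of plain Python. This is a minimal illustration, not autoRIFT's or xarray's implementation: the function names `pack` and `unpack` and the sample value are made up, while the scale_factor and add_offset are the M11 values from the ncdump output earlier in the thread.

```python
def unpack(packed, scale_factor, add_offset):
    """Reader side: scale the stored integer first, then add the offset."""
    return packed * scale_factor + add_offset


def pack(value, scale_factor, add_offset):
    """Writer side: subtract the offset, divide by the scale factor, and
    round to the nearest integer (rather than truncating toward zero)."""
    return round((value - add_offset) / scale_factor)


scale_factor = 0.0002135763  # M11 value from the ncdump output above
add_offset = 0.007011747     # M11 value from the ncdump output above

value = 0.0123  # made-up unpacked matrix element
stored = pack(value, scale_factor, add_offset)
restored = unpack(stored, scale_factor, add_offset)

# The round trip is lossy, but the error stays within half a scale_factor.
error = abs(restored - value)
```

The stored integer fits comfortably in a short, and the value read back differs from the original by at most half of scale_factor, which is the precision trade-off the convention accepts.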

mliukis commented 11 months ago

@jhkennedy @alex-s-gardner looks like we have to fix S1 granules again if we want to keep these attributes. I will need to confirm that the correct data is written to the file, which it most likely is :)

jhkennedy commented 11 months ago

> It's possible xarray just directly applied them when reading, as suggested here, and wrote out a normal float with those applied.

Nope, in both files, M11 and M12 are written as short, so I'm not sure if they've been applied. I'll look at my two granules more closely.

mliukis commented 11 months ago

> Nope, in both files, M11 and M12 are written as short, so I'm not sure if they've been applied. I'll look at my two granules more closely.

They were probably applied but then truncated to shorts on write.

mliukis commented 11 months ago

@jhkennedy @alex-s-gardner Not good news - when truncating M11 and M12 to shorts on write without these two attributes, we are basically storing zeros, as the original values are pretty small...
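A small sketch of why this loses the data: casting floats with magnitudes below 1 straight to an integer type truncates them to zero, whereas packing them with the two attributes first preserves them to within half a scale_factor. The sample values are made up for illustration; the scale_factor and add_offset are the M11 values from the ncdump output earlier in the thread.

```python
scale_factor = 0.0002135763  # M11 value from the ncdump output above
add_offset = 0.007011747     # M11 value from the ncdump output above

# Made-up unpacked values with magnitudes below 1.
values = [0.0123, -0.0041, 0.35]

# Written as integers without the attributes: truncation toward zero.
truncated = [int(v) for v in values]

# Written with packing, then read back by applying the attributes.
packed = [round((v - add_offset) / scale_factor) for v in values]
restored = [p * scale_factor + add_offset for p in packed]

# Largest round-trip error across the sample values.
max_error = max(abs(r - v) for r, v in zip(restored, values))
```

Here every entry of `truncated` is zero, while `restored` stays close to the original values.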

jhkennedy commented 11 months ago

Okay, that question is at least resolved; we do want to keep those attributes.

mliukis commented 11 months ago

@jhkennedy then why does this affect your changes to crop the granules, since you are setting the attributes?

jhkennedy commented 11 months ago

@mliukis I implemented cropping as a post-processing step following the method in your crop script. When comparing the uncropped and cropped S1 products, I noticed the attributes were missing, but hadn't grokked what the attributes were actually for, so initially thought it might be related to this change.