Open ajelenak opened 4 years ago
Here are my opinions on terms:
"Subsampled coordinate" implies to me that we're "thinning" coordinates, which can be the case but is not always so, as we sometimes interpolate pixel centres from provided corner coordinates.
I get the message here - that a "subsampled" coordinate does not need to correspond to the exact location of one of the full (post-interpolation) coordinates. I don't follow the example, though: I'm not sure what "corner coordinates" are. Is the conical scanner case an example of this?
Thanks!
Restored vs. full vs. complete vs. upsampled: I advocate using the term "full coordinates" to discuss the coordinates that result from interpolating as we advise the user, for much the same reason that I advocate using the term "compaction" to describe the reduction of the number of coordinates we provide.
Perhaps the result from interpolating should be "un" the name of the provided values, which is analogous to (un)packed and (un)compressed - terms which are already in use. I.e. if we were to call the values in the file "compacted", then the result of interpolation could be "uncompacted".
The specific example I had in mind was the VIIRS case - there the tiepoints correspond to the corners of interpolation groups. This example demonstrates a lot of the advantages:
A, B, C, and D are the corners of the interpolation zone. The grid spanned between them is split into individual pixels, whose centres are dots. By specifying how the tie-points are offset from the closest pixel corners and how to bend the lines AB and CD, AD and BC respectively, you can reconstruct all of those dots from A, B, C, D and the parameterization.
Thus A, B, C, and D don't belong to the original, "full" set of coordinates.
By specifying different offsets, of course, we could use tie-points whose positions are collocated with the centres of their corresponding pixels. Using the corners is nice because it allows us to store one set of tie-points that can be applied to observations from multiple instruments that observe on similar grids but at different resolutions (in the case of VIIRS these are the M- and I-bands).
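As a rough illustration of the corner-based idea, here is a minimal Python sketch. It assumes a flat x/y plane, straight edges (no "bending" parameterization) and plain bilinear weights; the function name `pixel_centres` and the corner ordering A, B, C, D around the zone are assumptions made for this example, not the VIIRS algorithm or anything defined in the proposal.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def pixel_centres(a: Point, b: Point, c: Point, d: Point,
                  n_rows: int, n_cols: int) -> List[List[Point]]:
    """Bilinearly place pixel centres inside the zone with corners A, B, C, D.

    Assumption: the corners go around the zone, so AB and DC are opposite
    edges, as are AD and BC. Edges are straight; no bending is applied.
    """
    centres = []
    for i in range(n_rows):
        t = (i + 0.5) / n_rows          # fractional position along AD / BC
        row = []
        for j in range(n_cols):
            s = (j + 0.5) / n_cols      # fractional position along AB / DC
            # Bilinear weights of the four corner tie points.
            w_a = (1 - s) * (1 - t)
            w_b = s * (1 - t)
            w_c = s * t
            w_d = (1 - s) * t
            x = w_a * a[0] + w_b * b[0] + w_c * c[0] + w_d * d[0]
            y = w_a * a[1] + w_b * b[1] + w_c * c[1] + w_d * d[1]
            row.append((x, y))
        centres.append(row)
    return centres


# The same four tie points serve two resolutions of the same zone.
A, B, C, D = (0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)
coarse = pixel_centres(A, B, C, D, n_rows=2, n_cols=2)
fine = pixel_centres(A, B, C, D, n_rows=4, n_cols=4)
```

Because the half-pixel offset keeps every centre strictly inside the zone, none of the reconstructed points coincides with A, B, C or D, which is the sense in which the tie points are not part of the "full" coordinate set.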
Perhaps the result from interpolating should be "un" the name of the provided values, which is analogous to (un)packed and (un)compressed - terms which are already in use. I.e. if we were to call the values in the file "compacted", then the result of interpolation could be "uncompacted".
Lots of wisdom there, I've edited my comment to say
- Uncompacted vs. restored vs. full vs. complete vs. upsampled: Originally, I advocated using the term "full coordinates" to discuss the coordinates that result from interpolating as we advise the user, for much the same reason that I advocate using the term "compaction" to describe the reduction of the number of coordinates we provide. However, @davidhassell notes that "un"-compacting coordinates would correspond with current vocabulary. The term feels a bit clunky in the mouth but I prefer precision to aesthetics.
We're considering this agreed and adopting this into the proposal's terms.
@erget - could you re-e-mail the zoom link? Thanks!
It wasn't clear to me towards the end of today's meeting: should this issue be reopened to discuss the specific attribute names using the adopted "compact" base term, or should there be a separate issue?
I had understood this issue to refer to general terminology issues - are you referring to namespacing or a similar concern?
I am thinking of the next step, which would be to apply the adopted base term "compact" to actual new attribute names. Is this what you call "namespacing"?
I think so. Let's see with some examples:
compact_dimension
compact_tie_point
in order to use the same prefix, thus making them easier to find. Is this what you mean?
I was thinking of the "compaction" more as a description of the overall process, not necessarily as a word that would appear in variable or attribute names.
See also my comment here: https://github.com/erget/subsampled-coordinates/issues/6#issuecomment-637571722
No, this issue was not for just the term describing the overall process but for the term to apply in new attribute names.
OK. I'd be fine with using interpolation_ as a prefix, if you think it makes sense in light of our describing the overarching process as "compaction".
@ajelenak I agree the issue is about the actual attribute names, but we will also have some terms describing the overall process, without these becoming part of actual attribute names.
I think we converged earlier on using interpolation as the equivalent of grid_mapping.
Probably the value of interpolation is that it describes what a user has to do to a compact product to get an uncompacted product. If we replaced interpolation altogether with compaction or compact in the attribute names, they would be less descriptive for a user that receives the product. The compaction has already taken place, generating the compact file.
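To make the compact/uncompacted distinction concrete, here is a minimal 1-D sketch in Python. The function names `compact` and `uncompact`, the fixed subsampling step, and the use of plain linear interpolation are assumptions for illustration only; none of them comes from the proposal.

```python
from typing import List, Tuple


def compact(full: List[float], step: int) -> Tuple[List[int], List[float]]:
    """Producer side: keep every `step`-th value, plus the last one, as tie points.

    Assumes `full` is non-empty.
    """
    indices = list(range(0, len(full), step))
    if indices[-1] != len(full) - 1:
        indices.append(len(full) - 1)
    return indices, [full[i] for i in indices]


def uncompact(indices: List[int], tie_points: List[float]) -> List[float]:
    """User side: linearly interpolate between consecutive tie points."""
    full = []
    for k in range(len(indices) - 1):
        i0, i1 = indices[k], indices[k + 1]
        v0, v1 = tie_points[k], tie_points[k + 1]
        for i in range(i0, i1):
            w = (i - i0) / (i1 - i0)    # 0 at the left tie point, -> 1 at the right
            full.append(v0 + w * (v1 - v0))
    full.append(tie_points[-1])         # the final tie point closes the array
    return full


# Nine evenly spaced latitudes are stored as three tie points.
latitudes = [10.0 + 0.5 * i for i in range(9)]
indices, tie_points = compact(latitudes, step=4)
restored = uncompact(indices, tie_points)
```

In this sketch the file would carry only `indices` and `tie_points`; the interpolation is the step left to the user, which is why an attribute named after it tells the reader what to do with the product.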
Don't know if that makes sense.
Another thought occurs to me - strictly speaking we're not always interpolating.
In the HDF-EOS case you might have a full set of coordinates that extends beyond the "corners" of the "compacted" coordinates, as hinted at here:
The same applies to the cases for microwave imagers where a single set of coordinates is provided, and these coordinates are used in order to extrapolate the coordinates of the other channels, as here:
Therefore I think that we'll need to think about this some more. I don't want to change the terminology when we're converging on the final blueprint for next week's presentation but it's an open issue in my opinion, as interpolation is simply the more common case here, but we want to accommodate extrapolation as well.
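A small Python sketch of why the two cases are close relatives: under the assumption of a simple linear relation between two tie points, the same formula covers both, and only the weight tells them apart. The helper names `evaluate` and `is_extrapolation` are hypothetical and not part of any proposal text.

```python
def linear_weight(i: float, i0: float, i1: float) -> float:
    """Fractional position of index i relative to the tie points at i0 and i1."""
    return (i - i0) / (i1 - i0)


def evaluate(i: float, i0: float, v0: float, i1: float, v1: float) -> float:
    """Value at index i on the straight line through (i0, v0) and (i1, v1)."""
    return v0 + linear_weight(i, i0, i1) * (v1 - v0)


def is_extrapolation(i: float, i0: float, i1: float) -> bool:
    """True when index i lies outside the span covered by the two tie points."""
    w = linear_weight(i, i0, i1)
    return w < 0.0 or w > 1.0


# Tie points at indices 0 and 4 with values 10.0 and 12.0.
inside = evaluate(2, 0, 10.0, 4, 12.0)    # weight 0.5: interpolation
beyond = evaluate(6, 0, 10.0, 4, 12.0)    # weight 1.5: extrapolation
before = evaluate(-2, 0, 10.0, 4, 12.0)   # weight -0.5: extrapolation
```

A term that names the attribute after "interpolation" describes only the weights between 0 and 1, which is the gap being pointed out here.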
Off the top of my head, this smells of "regridding"
What about "tie_point_gridding" or just "gridding"?
Gridding is a good word as it would cover both the processes of compacting and uncompacting as well as interpolation and extrapolation.
Tie points are a key element of the method and would make it more accurate and descriptive. Gridding alone would be conveniently short.
The full draft vocabulary currently used in the examples NDVI_lat_lon_Example, NDVI_grid_mapping_Example and VIIRS_M_and_I_Band_Example includes the following attribute names, based on the word interpolation:
In the data variable:
interpolation
interpolation_indices
interpolation_offsets
In the *_indices variables:
interpolation_dimension
In the container variable:
interpolation_name
interpolation_coefficients
interpolation_flags
location_tie_points
sensor_direction_tie_points
solar_direction_tie_points
lunar_direction_tie_points
time_tie_points
If we choose tie_point_gridding, the full vocabulary could be:
In the data variable:
tie_point_gridding
tie_point_gridding_indices
tie_point_gridding_offsets
In the *_indices variables:
tie_point_gridding_dimension
In the container variable:
tie_point_gridding_name
tie_point_gridding_coefficients
tie_point_gridding_flags
location_tie_points
sensor_direction_tie_points
solar_direction_tie_points
lunar_direction_tie_points
time_tie_points
If we choose gridding, the full vocabulary could be:
In the data variable:
gridding
gridding_indices
gridding_offsets
In the *_indices variables:
gridding_dimension
In the container variable:
gridding_name
gridding_coefficients
gridding_flags
location_tie_points
sensor_direction_tie_points
solar_direction_tie_points
lunar_direction_tie_points
time_tie_points
I lean toward tie_point_gridding to be absolutely explicit. Is that too clunky?
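Since the three candidate vocabularies differ only in the base term, they can be generated from one table, which makes it easy to compare them side by side. The sketch below is only a convenience for this discussion; the names `SUFFIXES_BY_VARIABLE`, `TIE_POINT_ATTRIBUTES` and `vocabulary` are made up here and carry no meaning in the proposal.

```python
from typing import Dict, List

# Suffixes appended to the base term, grouped by the variable carrying the attribute.
SUFFIXES_BY_VARIABLE: Dict[str, List[str]] = {
    "data variable": ["", "_indices", "_offsets"],
    "*_indices variables": ["_dimension"],
    "container variable": ["_name", "_coefficients", "_flags"],
}

# Attribute names that stay the same whichever base term is chosen.
TIE_POINT_ATTRIBUTES: List[str] = [
    "location_tie_points",
    "sensor_direction_tie_points",
    "solar_direction_tie_points",
    "lunar_direction_tie_points",
    "time_tie_points",
]


def vocabulary(base: str) -> Dict[str, List[str]]:
    """Build the attribute names for one candidate base term."""
    vocab = {
        where: [base + suffix for suffix in suffixes]
        for where, suffixes in SUFFIXES_BY_VARIABLE.items()
    }
    # The tie point attributes live in the container variable in every scheme.
    vocab["container variable"] = vocab["container variable"] + TIE_POINT_ATTRIBUTES
    return vocab


for base in ("interpolation", "tie_point_gridding", "gridding"):
    for where, names in vocabulary(base).items():
        print(f"{base} | {where}: {', '.join(names)}")
```

Printing the three schemes next to each other shows the length cost of tie_point_gridding directly, e.g. tie_point_gridding_coefficients against gridding_coefficients.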
There are at least two different naming schemes for new attributes currently in use. Since the number of these new attributes seems to be stabilizing, now would be a good time to pick a base term from which to derive new attribute names.
These two terms have been used so far: subsample and interpolate. Additional options: reduce, restore.
A related issue is the term tie point. It is somewhat specific to remote sensing. Perhaps anchor is a more generic term but equally applicable?
Other naming suggestions are welcome!