davidfrantz / force

Framework for Operational Radiometric Correction for Environmental monitoring
GNU General Public License v3.0
172 stars 50 forks source link

Hierachy of the interpolation module #322

Closed geo-masc closed 2 months ago

geo-masc commented 4 months ago

Hi David, hi all,

We observed a behaviour of which we assume it is either not intenionally or not sufficiently documented. This relates to the Interpolation module, which is per default set to RBF in the paramter file. But even if you opt out the output of the interpolation module (i.e., OUTPUT_TSI = FALSE) the following modules (such as STMs, UDF, etc.) will use the interpolated time series.

If this happens on purpose, I would suggest to add this to the documentation / description in the TSA PRM file. Further, I would suggest to set the default value to INTERPLOATE=NONE which I found in line 1413 in param-aux.c: 'fprintf(fp, "INTERPOLATE = RBF\n");' This would prevent users from unintentionally using interpolated data as an input to other modules.

If this behaviour is not intenionally you would probably need to make sure that the interpolation module is not called at all, if OUTPUT_TSI = FALSE. But maybe I am missing something here.

I just wanted to point it out and thus refrained from suggesting direct changes in the code.

Cheers,

Marcel

davidfrantz commented 3 months ago

Hi Marcel,

this behaviour is intended. And I believed it is also sufficiently documented.

It is quite clearly indicated in the parameter file, no? What would you suggest to make it even clearer?

# Output the Time Series Interpolation? This is a layer stack of index
# values for each interpolated date. Note that interpolation will be per-
# formed even if OUTPUT_TSI = FALSE - unless you specify INTERPOLATE = NONE.
# Type: Logical. Valid values: {TRUE,FALSE}
OUTPUT_TSI = FALSE

In general, all OUTPUT_XXX option trigger output, as well as computing of the feature. However, this does not mean that it is only computed when output is desired. It is also computed when it is needed for following steps (folding time series or phenometrics are another example of this when used in trend / CAT analyses).

In the majority of instances, interpolation should be done on the fly without needing to write the dataset to disc.

Please also see the workflow overview in the docs: image

Changing the default to NONE might be an option. But changing defaults risks breaking automatic workflows of people...

Cheers, David

geo-masc commented 3 months ago

Hi David,

Thank you for clarifying and please excuse that I did not check the workflow in the documentation, which clearly shows that this is the intended behaviour.

As a "frequent user" I usually work with the uncommented PRM files thus I did not realize the remark at the end of the interpolation block. Still, as the discussion came up in our group I have the feeling that not everyone is aware of this behaviour and that interpolation might have been used unintenionally in one or the other analyses. Although it is of course highly recommended, I don't think everyone reads all the documentation, even for modules that are not intended to be used.

Setting default to NONE would in my eyes be the easiest option. I clearly see your point with the automatic workflows, but without any feeling of how many users might affected. In any case, I would suggest moving or copying (maybe even highlighting) the quoted comment to the beginning of the interpolation module block in the PRM, which could improve visibility and hopefully prevent unintenional usage.

Cheers, Marcel

davidfrantz commented 2 months ago

Hi Marcel

I don't think everyone reads all the documentation what should I say ;)

However, with some thought, I changed the default to NONE a minute ago (dev)

Cheers, David