watem-sedem / rfactor

R-factor
https://watem-sedem.github.io/rfactor/
GNU Lesser General Public License v3.0
6 stars 2 forks source link

Renaming of intensity and energy functions #82

Open Sachagobeyn opened 2 months ago

Sachagobeyn commented 2 months ago

Current functions are an implementation of Verstraeten et. al. 2006 applicable to Flanders. When new functions are added, the current naming doesn't add up when new functions are suggested to the package, i.e. https://github.com/watem-sedem/rfactor/issues/80. We will first propose the rename and workflow so to integrate these functions. This is critical before adding new functions. It also requires us to fix https://github.com/watem-sedem/rfactor/issues/81

Sachagobeyn commented 2 months ago

Suggestion for rename/Addition

Rename:

Add:

Refactor:

compute_erosivity(rain, intensity_method=maximum_intensity) to compute_erosivity(rain, intensity_method=maximum_intensity, rain_energy_method=rain_energy_per_unit_depth_verstraete)

@SethCallewaert : can you review this list, I'll update in code.

SethCallewaert commented 2 months ago

Hi Sacha

Some remarks:

Sachagobeyn commented 1 month ago

Hi @SethCallewaert

Some remarks:

_- what do you mean with the maximum_intensity_interpolate with expanded docs on interpolation approach, I thought no interpolation was done by the function, and should be done in the data preparation? maybe this can be named maximum_intensity_matlabupdate (or something similar) since it is still derrived from the matlab initial code.

There is an interpolation for NaN, it is explicitely formulated in the code, see https://github.com/watem-sedem/rfactor/blob/master/src/rfactor/rfactor.py#L185, see also https://github.com/watem-sedem/rfactor/blob/master/src/rfactor/rfactor.py#L154. We do not interpolate on rolling, see https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.interpolate.html https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.rolling.html

_- for maximum_intensity_matlabrolling, I would consider removing the matlab, since this is really specific to python (pandas.rolling) and is an alternative for the matlab implemantation

ah, yes, it was a typo, it is rather maximum_intensity_python_rolling

- watch out: it's verstraeten in stead of verstraete

ok

_- I do not know if there could be multiple formulae formulated by the same author, but we could maybe add the year to the rain_energy_per_unitdepth-functions.

good punt, i'll adjust this in https://github.com/watem-sedem/rfactor/pull/94

ok, good point: I'll adjust in https://github.com/watem-sedem/rfactor/pull/94

New proposal:

SethCallewaert commented 1 month ago

Hi Sacha,

There is an interpolation for NaN, it is explicitely formulated in the code, see https://github.com/watem-sedem/rfactor/blob/master/src/rfactor/rfactor.py#L185, see also https://github.com/watem-sedem/rfactor/blob/master/src/rfactor/rfactor.py#L154. We do not interpolate on rolling, see https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.interpolate.html https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.rolling.html

Yes, indeed, my bad! Still I think it is not very good to interpolate here, since we are doing it in the data preparation. So maybe we should scratch the interpolation and make clear that if there are no values for a given period the model reads this as no precipitation was registered, since the df can not contain any 0 or nan values there is no way to distinguish between both. So if there are NaN values in the original timeline the should be interpolated before converting them into the rain_mm input dataframe, so that these are included as rain, while the original 0 values will not be counted... there is no reason to suppose that all removed values within a rainfall event must have a value >0 (and that therefor an interpolation is needed). This however will still yield a different outcome than the rolling pandas implementation, since more restrictions seem to be taken into account. The impact and the calculation of short rainfall events should also be readdressed (perhaps in different PR).

New proposal: maximum_intensity_matlab_clone > maximum_intensity_matlab_legacy with depredecate warning maximum_intensity_matlab_clone_fix -> maximum_intensity_interpolate maximum_intensity -> maximum_intensity rain_energy_per_unit_depth -> rain_energy_per_unit_depth_verstraeten rain_energy_brown_and_foster rain_energy_mcgregor

You agreed with me on the topic of adding the year and shortening the name, however none are suggested here ;) There is also twice maximum intensity, which is which? I thought we would call one maximum_intensity_python_rolling, and since i want to remove the interpolation we should think of a different name there as well, maybe referring to Renard1997 or Wichmeier (I should look for the original paper on which this approach was based).

Sachagobeyn commented 1 month ago

New proposal:

maximum_intensity_matlab_clone > same with depredecate warning
maximum_intensity_matlab_clone_fix -> same but with maximum_intensity_matlab
maximum_intensity -> maximum_intensity
rain_energy_per_unit_depth -> rain_energy_verstraeten_2006
rain_energy_brown_and_foster_1987
rain_energy_mcgregor_1995

Notes: