Open Sachagobeyn opened 2 months ago
Suggestion for rename/Addition
Rename:
Add:
Refactor:
compute_erosivity(rain,
intensity_method=maximum_intensity) to compute_erosivity(rain, intensity_method=maximum_intensity, rain_energy_method=rain_energy_per_unit_depth_verstraete)
@SethCallewaert : can you review this list, I'll update in code.
Hi Sacha
Some remarks:
what do you mean with the maximum_intensity_interpolate with expanded docs on interpolation approach, I thought no interpolation was done by the function, and should be done in the data preparation? maybe this can be named maximum_intensity_matlab_update (or something similar) since it is still derrived from the matlab initial code.
for maximum_intensity_matlab_rolling, I would consider removing the matlab, since this is really specific to python (pandas.rolling) and is an alternative for the matlab implemantation
watch out: it's verstraeten in stead of verstraete
I do not know if there could be multiple formulae formulated by the same author, but we could maybe add the year to the rain_energy_per_unit_depth-functions.
maybe we can shorten the name of the rain_energy_per_unit_depth to rain_energy, seems rather long without much purpose. I do not think it will be confusing for the user if the per_unit_depth is removed, but it is also just the laziness talking for having to type the whole function ;)
Hi @SethCallewaert
Some remarks:
_- what do you mean with the maximum_intensity_interpolate with expanded docs on interpolation approach, I thought no interpolation was done by the function, and should be done in the data preparation? maybe this can be named maximum_intensity_matlabupdate (or something similar) since it is still derrived from the matlab initial code.
There is an interpolation for NaN, it is explicitely formulated in the code, see https://github.com/watem-sedem/rfactor/blob/master/src/rfactor/rfactor.py#L185, see also https://github.com/watem-sedem/rfactor/blob/master/src/rfactor/rfactor.py#L154. We do not interpolate on rolling, see https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.interpolate.html https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.rolling.html
_- for maximum_intensity_matlabrolling, I would consider removing the matlab, since this is really specific to python (pandas.rolling) and is an alternative for the matlab implemantation
ah, yes, it was a typo, it is rather maximum_intensity_python_rolling
- watch out: it's verstraeten in stead of verstraete
ok
_- I do not know if there could be multiple formulae formulated by the same author, but we could maybe add the year to the rain_energy_per_unitdepth-functions.
good punt, i'll adjust this in https://github.com/watem-sedem/rfactor/pull/94
ok, good point: I'll adjust in https://github.com/watem-sedem/rfactor/pull/94
New proposal:
Hi Sacha,
There is an interpolation for NaN, it is explicitely formulated in the code, see https://github.com/watem-sedem/rfactor/blob/master/src/rfactor/rfactor.py#L185, see also https://github.com/watem-sedem/rfactor/blob/master/src/rfactor/rfactor.py#L154. We do not interpolate on rolling, see https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.interpolate.html https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.rolling.html
Yes, indeed, my bad! Still I think it is not very good to interpolate here, since we are doing it in the data preparation. So maybe we should scratch the interpolation and make clear that if there are no values for a given period the model reads this as no precipitation was registered, since the df can not contain any 0 or nan values there is no way to distinguish between both. So if there are NaN values in the original timeline the should be interpolated before converting them into the rain_mm input dataframe, so that these are included as rain, while the original 0 values will not be counted... there is no reason to suppose that all removed values within a rainfall event must have a value >0 (and that therefor an interpolation is needed). This however will still yield a different outcome than the rolling pandas implementation, since more restrictions seem to be taken into account. The impact and the calculation of short rainfall events should also be readdressed (perhaps in different PR).
New proposal: maximum_intensity_matlab_clone > maximum_intensity_matlab_legacy with depredecate warning maximum_intensity_matlab_clone_fix -> maximum_intensity_interpolate maximum_intensity -> maximum_intensity rain_energy_per_unit_depth -> rain_energy_per_unit_depth_verstraeten rain_energy_brown_and_foster rain_energy_mcgregor
You agreed with me on the topic of adding the year and shortening the name, however none are suggested here ;) There is also twice maximum intensity, which is which? I thought we would call one maximum_intensity_python_rolling, and since i want to remove the interpolation we should think of a different name there as well, maybe referring to Renard1997 or Wichmeier (I should look for the original paper on which this approach was based).
New proposal:
maximum_intensity_matlab_clone > same with depredecate warning
maximum_intensity_matlab_clone_fix -> same but with maximum_intensity_matlab
maximum_intensity -> maximum_intensity
rain_energy_per_unit_depth -> rain_energy_verstraeten_2006
rain_energy_brown_and_foster_1987
rain_energy_mcgregor_1995
Notes:
Current functions are an implementation of Verstraeten et. al. 2006 applicable to Flanders. When new functions are added, the current naming doesn't add up when new functions are suggested to the package, i.e. https://github.com/watem-sedem/rfactor/issues/80. We will first propose the rename and workflow so to integrate these functions. This is critical before adding new functions. It also requires us to fix https://github.com/watem-sedem/rfactor/issues/81