IDEMSInternational / R-Instat

A statistics software package powered by R
http://r-instat.org/
GNU General Public License v3.0
38 stars 102 forks source link

Fixed Water Balance Bug in End of Season Dialogue #8916

Closed MeSophie closed 3 months ago

MeSophie commented 6 months ago

Fixes #8910 @rdstern @lilyclements I corrected the Water Balance code when using variable with reducing unchecked in End of season dialogue. Please have a look.

rdstern commented 6 months ago

@MeSophie I started here with the simplest case for dodoma and got this error:

image

MeSophie commented 6 months ago

@rdstern I Fixed the bug. @lilyclements I wondering what changes the new code will bring. I've tried the new version and the old version of the code with CIMH climate data and found the same result.

image image

On the other side, ticking the reducing box gives the following error:

image image

I don't know whether this is due to a fault on my part.

lilyclements commented 6 months ago

@MeSophie this error is occurring because of NA values in the evaportranspiration variable. Can you try replacing them with a different value (e.g., value from above, in the Prepare > Data Frame > Replace Values dialog), and then see how it works?

MeSophie commented 5 months ago

@MeSophie this error is occurring because of NA values in the evaportranspiration variable. Can you try replacing them with a different value (e.g., value from above, in the Prepare > Data Frame > Replace Values dialog), and then see how it works?

@lilyclements It works well. Thank you. @rdstern Over to you now. Could you please test the dialogue again?

rdstern commented 5 months ago

a) @MeSophie could you please send me the data you are using - with the missing values initially? Not through github, but only to me, as they are nervous about sharing data. b) @lilyclements just to add that maybe this could also be a good data example to illustrate current facilities in R-Instat for infilling missing values. a) Your copy from above, then b) calculator with alternative methods, from zoo, in the transform keyboard, then also Climatic > Check Data > Infill dialog! c) We hope to produce a new version - on the web if ok - once the evapotranspiration and reducing evaporation are both ok. There have been many improvements since the last released version.

rdstern commented 5 months ago

@MeSophie I think you may be almost there now - at least in this dialog. But I still can get an error: image

I tried progresively for Dodoma as follows, and with no problem initially:

a) Constant evaporation not reducing, from 1 March. Ok b) Evap column - constant value of 5, not reducing. Ok, and same answers, as expected. c) Evap column - and reducing - from fraction 0.5, dowen to 5 - still ok! Hooray! d) Repeat, but change down to, to 10. Gives error.

Tried again and something is not clearing presumably, because with evap column and reducing, the second time you run - even changing nothing - it gives a problem.

I have not tried with multiple stations, e.g. Ghana data!

MeSophie commented 5 months ago

@rdstern Could you please test the new changes? Thank you.

rdstern commented 5 months ago

@MeSophie here is still the problem. I now get no problem with Dodoma with constant evaporation or a variable. I can also repeat the calculation and get the same answers. I am also ok with the reducing evaporation - down to 5, reducing from 0.5, the first time I run, and the answers look sensible.

Returning to the dialog and repeating with reducing gives this error:

image

What's interesting however is that I now went back to the non-reducing - and that was ok.

Then I tried reducing again, and that was ok. Then I changed the threshold to 10 (from 5) and that was ok too. But when I repeated the 10 - so changed nothing, then it complains. It seems that if I change something, then it is ok!!!

I have now also tried with Ghana. I kept trying different options and there must also still be a problem with the reducing calculations. I guess this will become a @lilyclements question? Here is an example for the Ghana data:

image

This gives quite a lot of early missing years at both sites. I assumre it may be because of missing values in the rainfall data. We should be treating missing values in the rainfall identically, whether, or not there is reducing evaporation. There is nothing missing in evaporation, because it is 5mm every day.

@MeSophie could you also fix a minor problem in the defaults for the controls. If I start with End of the Rains and then move to the End of the Season, all is ok as the defaults for the end of the season.
But, if I run the End of the Rains first, and then go to the End of the Season, here is what I get:

image

It has remembered the day range - which is good. But the capacity is now 1, rather than 100. And ther lower limit is now 0.00, rather than 0.50.

Thanks

Roger

MeSophie commented 5 months ago

@rdstern Could you please test the dialogue again?

rdstern commented 5 months ago

@MeSophie I note the change in the dialog, from rains to season, is now fixed. But the results from the simplest use of the season are odd. This is also the sort of thing I would like you to be checking yourself. It is imp[ortant that the results are correct. Here is the simplest use of the dialog:

image

Do you notice something odd about the water-balance results? I wqonder why?

I checked quickly using an old version - namely Version 0.7.6. Here they are there:

image

If you don't have the old version installed, then you might like to try the Transform dialog - which (I hope) is working ok, if reducing evaporation isn't used.


Now I have continued and tried all the options for Dodoma. @MeSophie you have done well, as it all seems to work now:

image

Moreover, the results from the reducing option look sensible!!! Great. Perhapse are nearly through with this?

I assume also the R-code for the reducing evaporation in the Transform dialog,whould be similar, but considerably simpler, because there is no summary stage. Once that's working we should also be able to use it to check these results.

MeSophie commented 4 months ago

@rdstern I made some small change on the code here according to Lily comment on PR #8935. Please have a look.

rdstern commented 4 months ago

@MeSophie and @lilyclements I have not got as far as reducing evaporation yet. I tried for dodoma with end of the rains, which looks fine. Then I went to end of the season as follows:

image

You will see that the date for the end of the season is the same as the end of the rains - and in every year! This can't be, because the end of the rains has at least 10mm, so can't be close to zero on the same day.

Here is the code: I can't see anywhere that it refers to the 0.05, which is the trigger for the end of the season.

# Dialog: End of Rains/Season

rain_min <- instat_calculation$new(type="calculation", function_exp="ifelse(test=is.na(x=rain), yes=0, no=rain)", result_name="rain_min", calculated_from=list("dodoma"="rain"))
wb_min <- instat_calculation$new(type="calculation", function_exp="purrr::accumulate(.f= ~ pmin(pmax(..1 + ..2, 0), 100), .x=tail(x=rain_min - 5, n=-1), .init=0)", result_name="wb_min", sub_calculations=list(rain_min))
rain_max <- instat_calculation$new(type="calculation", function_exp="ifelse(yes=100, test=is.na(x=rain), no=rain)", result_name="rain_max", calculated_from=list("dodoma"="rain"))
wb_max <- instat_calculation$new(type="calculation", function_exp="purrr::accumulate(.f= ~ pmin(pmax(..1 + ..2, 0), 100), .x=tail(x=rain_max - 5, n=-1), .init=0)", result_name="wb_max", sub_calculations=list(rain_max))
wb <- instat_calculation$new(type="calculation", function_exp="ifelse(test=(wb_min != wb_max) | is.na(x=rain), yes=NA, no=wb_min)", result_name="wb", sub_calculations=list(wb_min, wb_max))
conditions_filter <- instat_calculation$new(type="filter", function_exp="(wb) | is.na(x=rain)", sub_calculations=list(wb))
grouping_by_station_year <- instat_calculation$new(type="by", calculated_from=list("dodoma"="year"))
doy_filter <- instat_calculation$new(type="filter", function_exp="doy_366 >= end_rains & doy_366 <= 182", calculated_from=calc_from_convert(x=list(dodoma="doy_366", dodoma_by_year=c("end_rains"))))
end_season <- instat_calculation$new(type="summary", function_exp="ifelse(test=is.na(x=dplyr::first(x=wb)), yes=NA, no=dplyr::first(x=doy_366))", result_name="end_season", calculated_from=list("dodoma"="doy_366"), save=2)
end_of_season_combined <- instat_calculation$new(type="combination", manipulations=list(conditions_filter, grouping_by_station_year, doy_filter), sub_calculations=list(end_season))
data_book$run_instat_calculation(calc=end_of_season_combined, display=FALSE)
rm(list=c("end_of_season_combined", "conditions_filter", "wb", "wb_min", "rain_min", "wb_max", "rain_max", "grouping_by_station_year", "doy_filter", "end_season"))
MeSophie commented 4 months ago

@rdstern I am trying to reproduce the error please could you give me more information about the day range that you used. from the code you provided I can see that the difference between your code and mine is the day range. Without any change on day range from my side everything is fine.

image

And you are right the was 0.5 on the code I removed it by mistake, I will put it back.

MeSophie commented 3 months ago

@rdstern Finaly I was able to reproduce your day range This is my result it true the is a lot of missing value by the results look different. The 0.5 is back on the code.

image


# Dialog: End of Rains/Season

rain_min <- instat_calculation$new(type="calculation", function_exp="ifelse(test=is.na(x=rain), yes=0, no=rain)", result_name="rain_min", calculated_from=list("dodoma"="rain"))
wb_min <- instat_calculation$new(type="calculation", function_exp="purrr::accumulate(.f= ~ pmin(pmax(..1 + ..2, 0), 100), .x=tail(x=rain_min - 5, n=-1), .init=0)", result_name="wb_min", sub_calculations=list(rain_min))
rain_max <- instat_calculation$new(type="calculation", function_exp="ifelse(yes=100, test=is.na(x=rain), no=rain)", result_name="rain_max", calculated_from=list("dodoma"="rain"))
wb_max <- instat_calculation$new(type="calculation", function_exp="purrr::accumulate(.f= ~ pmin(pmax(..1 + ..2, 0), 100), .x=tail(x=rain_max - 5, n=-1), .init=0)", result_name="wb_max", sub_calculations=list(rain_max))
wb <- instat_calculation$new(type="calculation", function_exp="ifelse(test=(wb_min != wb_max) | is.na(x=rain), yes=NA, no=wb_min)", result_name="wb", sub_calculations=list(wb_min, wb_max))
conditions_filter <- instat_calculation$new(type="filter", function_exp="(wb <= 0.5) | is.na(x=rain)", sub_calculations=list(wb))
grouping_by_station_year <- instat_calculation$new(type="by", calculated_from=list("dodoma"="year"))
doy_filter <- instat_calculation$new(type="filter", function_exp="doy_366 >= end_rains & doy_366 <= 182", calculated_from=calc_from_convert(x=list(dodoma="doy_366", dodoma_by_year=c("end_rains"))))
end_season <- instat_calculation$new(type="summary", function_exp="ifelse(test=is.na(x=dplyr::first(x=wb)), yes=NA, no=dplyr::first(x=doy_366))", result_name="end_season1", calculated_from=list("dodoma"="doy_366"), save=2)
end_of_season_combined <- instat_calculation$new(type="combination", manipulations=list(conditions_filter, grouping_by_station_year, doy_filter), sub_calculations=list(end_season))
data_book$run_instat_calculation(calc=end_of_season_combined, display=FALSE)
rm(list=c("end_of_season_combined", "conditions_filter", "wb", "wb_min", "rain_min", "wb_max", "rain_max", "grouping_by_station_year", "doy_filter", "end_season"))
# Dialog: End of Rains/Season

roll_sum_rain <- instat_calculation$new(type="calculation", function_exp="RcppRoll::roll_sumr(x=rain, n=1, fill=NA, na.rm=FALSE)", result_name="roll_sum_rain", calculated_from=list("dodoma"="rain"))
conditions_filter <- instat_calculation$new(type="filter", function_exp="(roll_sum_rain > 10) | is.na(x=roll_sum_rain)", sub_calculations=list(roll_sum_rain))
grouping_by_station_year <- instat_calculation$new(type="by", calculated_from=list("dodoma"="year"))
doy_filter <- instat_calculation$new(type="filter", function_exp="doy_366 >= end_rains & doy_366 <= 182", calculated_from=calc_from_convert(x=list(dodoma="doy_366", dodoma_by_year=c("end_rains"))))
end_rains <- instat_calculation$new(type="summary", function_exp="ifelse(test=is.na(x=dplyr::last(x=roll_sum_rain)), yes=NA, no=dplyr::last(x=doy_366))", result_name="end_rains1", calculated_from=list("dodoma"="doy_366"), save=2)
end_of_rains_combined <- instat_calculation$new(type="combination", manipulations=list(conditions_filter, grouping_by_station_year, doy_filter), sub_calculations=list(end_rains))
data_book$run_instat_calculation(calc=end_of_rains_combined, display=FALSE)
rm(list=c("end_of_rains_combined", "conditions_filter", "roll_sum_rain", "grouping_by_station_year", "doy_filter", "end_rains"))
lilyclements commented 3 months ago

The 0.5 is back on the code.

@MeSophie nice! Does this mean this problem is now fixed?

MeSophie commented 3 months ago

The 0.5 is back on the code.

@MeSophie nice! Does this mean this problem is now fixed?

@lilyclements It is better to let @rdstern confirm it.