Open dannyparsons opened 5 years ago
I just came across this, but I think this is timely since we've been updating the calculation system to have `.drop` and `.preserve`. I've assumed so far to keep `.drop = TRUE` and `.preserve = FALSE` by default, because that is how R-Instat was working before these parameters were created. However, I think it would be good to discuss this.
@rdstern do you have any strong thoughts on this?
@lilyclements good question. Originally the dialog was contradictory, in that it did `.drop = TRUE`, but the disabled control indicated otherwise. I prefer to keep `.drop = TRUE` as the default. My reasoning is partly that I am very happy that our summarise works fine, even when the by variables are not factors. (Users suffer with Genstat, which is much stricter concerning factor variables.) We really simplified tutorial 2 when Danny realised we don't have to bother users to make year a factor for some operations, while considering it as numeric for others. So I was a bit concerned that we had to bother about it again with the start of the rains stuff. I don't think we do now. So our default gives the same result whether the by variables are factors, dates, or numeric. `.drop = FALSE` is a bonus that is available to us when all the by variables are factors!
Hope you agree. If so, then @rachelkg I think we could make that point in the help?
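For reference, here is a minimal sketch of the difference being discussed, assuming dplyr 0.8.0 or later. The data frame and column names (`df`, `year`, `year_f`, `rain`) are made up for illustration and are not taken from R-Instat.

```r
library(dplyr)

# Hypothetical data: no observations recorded for 2002.
df <- data.frame(
  year = c(2001, 2001, 2003),
  rain = c(5, 2, 7)
)
# Factor version of year whose levels include the missing 2002.
df$year_f <- factor(df$year, levels = c(2001, 2002, 2003))

# Numeric by variable: .drop has nothing to act on, only observed years appear.
by_numeric <- df %>%
  group_by(year, .drop = FALSE) %>%
  summarise(total = sum(rain))

# Factor by variable with the default .drop = TRUE: unused level 2002 is dropped.
by_factor_drop <- df %>%
  group_by(year_f) %>%
  summarise(total = sum(rain))

# Factor by variable with .drop = FALSE: the empty level 2002 gets its own row.
by_factor_keep <- df %>%
  group_by(year_f, .drop = FALSE) %>%
  summarise(total = sum(rain))
```

With the default, the numeric and factor versions give the same set of years. Only the `.drop = FALSE` call on the factor adds a row for the empty level.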
`dplyr::filter` has gained a `.preserve` argument which can preserve original groupings after filtering.

Example,

or now with `.preserve = TRUE`, we keep all the years in the summary.

`.preserve = TRUE` solves our issues in climatic summaries like start of rains, where we filter out every day in the year but would still like to report the start of rains as `NA`.

But should this be the default for our calculation system? You may not want this, for example, if you are also filtering on the years to get fewer rows in the summary table.
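To make the trade-off concrete, here is a small sketch of the start of rains case, assuming dplyr 0.8.0 or later. The data, the column names (`daily`, `doy`, `rain`), and the "first day with any rain" rule are simplified assumptions for illustration, not the R-Instat definition.

```r
library(dplyr)

# Hypothetical daily data: 2002 has no rainy day at all.
daily <- data.frame(
  year = c(2001, 2001, 2002, 2002, 2003, 2003),
  doy  = c(100L, 101L, 100L, 101L, 100L, 101L),
  rain = c(0, 5, 0, 0, 12, 3)
)

# Helper (illustrative): first rainy day of year, or NA if nothing is left.
first_doy <- function(x) if (length(x) > 0) min(x) else NA_integer_

# Default .preserve = FALSE: groups are recalculated after filtering,
# so a year with every day filtered out vanishes from the summary.
dropped <- daily %>%
  group_by(year) %>%
  filter(rain > 0) %>%
  summarise(start_doy = first_doy(doy))

# .preserve = TRUE: the original grouping is kept, so the empty year
# still gets a row and the start of rains is reported as NA.
kept <- daily %>%
  group_by(year) %>%
  filter(rain > 0, .preserve = TRUE) %>%
  summarise(start_doy = first_doy(doy))
```

The flip side is the case raised above: adding a condition such as `year >= 2002` to the `.preserve = TRUE` filter would still leave a 2001 row in the summary, with `NA` as its value.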