Open bockthom opened 4 months ago
I have done a prototype implementation here and some tests here. Especially, let me know whether sliding windows and cumulative ranges are mutually exclusive; I would need to slightly update my implementation in that case.
Edit: Also let me know if this addition fits for you with my currently open wish-wash PR or if we should wait for a new one.
I have done a prototype implementation
The implementation looks good to me (except for two typos/inconsistencies).
and some tests
The structure of the tests looks good, but I did not have time yet to find out whether the behavior in the tests is correct or not.
Especially, let me know whether sliding windows and cumulative ranges are mutually exclusive
I've seen that you already have tests for the combination of cumulative ranges and sliding windows, but just from looking at the tests I cannot judge whether such a combination is useful or not. Could you please post a small example directly showing what the ranges look like in such a case?
Also let me know if this addition fits for you with my currently open wish-wash PR or if we should wait for a new one.
If the implementation stays as small as it is currently, I'd go for adding it to your "open wish-wash PR". But let's discuss this tomorrow.
Regarding sliding-window ranges and our recent discussion: `sliding.windows` in `construct.ranges` differs in some way from what we understand by `sliding.windows` in splitting.

In splitting, when we specify `sliding.windows`, that means that we additionally split each of the previously defined regions into two halves and connect every second one.

In `construct.ranges`, when `sliding.windows` is specified, we assume that the input revisions are the result of splitting based on `sliding.windows`, and therefore we do not split them in halves again; we only connect every second one.

Now, regarding the cumulative ranges, that means the following (example):
We want to split data into the following bins: 2016-01-01 - 2017-01-01, 2017-01-01 - 2018-01-01, and 2018-01-01 - 2019-01-01. We also specify `sliding.windows = TRUE` and therefore, in the end, receive network(-split)s that have the following bounds: 2016-01-01 - 2017-01-01, 2016-07-01 - 2017-07-01, 2017-01-01 - 2018-01-01, 2017-07-01 - 2018-07-01, and 2018-01-01 - 2019-01-01 (which is also exactly the output of `construct.ranges(..., sliding.window = TRUE)`).
When we construct ranges and specify to construct cumulative ranges, all resulting ranges start with the start of the earliest range, i.e., the resulting ranges would be 2016-01-01 - 2017-01-01, 2016-01-01 - 2017-07-01, 2016-01-01 - 2018-01-01, 2016-01-01 - 2018-07-01, and 2016-01-01 - 2019-01-01.
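The example above can be reproduced with a few lines of base R. This is only a sketch under stated assumptions: it does not call coronet, the variable names are made up for illustration, and the half-year boundaries are taken to be the already-split revisions that `construct.ranges` would receive.

```r
# Sketch only: plain base R, not coronet's implementation.
# Boundaries as they would result from a sliding-window split
# (half-year steps, as in the example above).
bounds <- c("2016-01-01", "2016-07-01", "2017-01-01", "2017-07-01",
            "2018-01-01", "2018-07-01", "2019-01-01")
n <- length(bounds)

# Sliding-window ranges: connect every second boundary (i -> i + 2).
sliding <- paste(bounds[1:(n - 2)], bounds[3:n], sep = " - ")

# Cumulative variant: keep each end, but pin every start to the earliest boundary.
cumulative <- paste(bounds[1], bounds[3:n], sep = " - ")

print(sliding)
print(cumulative)
```

Both vectors contain five ranges; the cumulative one differs only in that every range starts at 2016-01-01.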
Taking everything into account, I think cumulative sliding-window ranges may be as useful as cumulative regular ranges, depending on the use case, but I'm not entirely sure ^^
Ok, let's keep the option to construct cumulative ranges for sliding-window ranges. (I don't think that this will actually be used; but, in general, the resulting ranges look reasonable.)
Description
In coronet, we have a function `construct.ranges` that takes a list of revisions and creates range names out of it, as in the following example:

This function is able to construct sliding-window ranges, but not cumulative ranges.

We have a dedicated function `construct.cumulative.ranges`, but this function has a completely different interface (it takes a start date, an end date, and a time period), similar to `construct.consecutive.ranges` and `construct.overlapping.ranges`. However, the function `construct.ranges` itself (which takes just a vector of dates) is not capable of constructing cumulative ranges.

Therefore, I suggest enhancing the function `construct.ranges` with an additional parameter to construct cumulative ranges. If adding a new parameter introduces more problems than benefits, an additional function might also be helpful, but then we have the problem of naming conflicts with the existing functions. So, I'd be glad if we find a suitable way to enhance the existing function `construct.ranges`.

Desired output for `construct.ranges` with cumulative ranges:

Motivation
Constructing ranges in a cumulative way is particularly useful when analyzing commit-interaction data, but also in many other use cases. In general, enhancing the currently existing function would provide an easy way to construct range-data objects cumulatively by simply passing a list of fixed bins to the range-construction function, and passing the resulting ranges to `split.data.time.based.by.ranges` afterwards.
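To make the proposal more concrete, here is a minimal sketch of what a range-construction function with an additional cumulative switch could look like. It is not coronet's code: the function name `construct.ranges.sketch`, the parameter name `cumulative`, and the `-` separator in the range names are assumptions made purely for illustration.

```r
# Hypothetical sketch, not coronet's construct.ranges.
# revs:           vector of revisions or dates (bin boundaries)
# sliding.window: if TRUE, each range spans two steps instead of one
# cumulative:     if TRUE, every range starts at the first revision
construct.ranges.sketch <- function(revs, sliding.window = FALSE, cumulative = FALSE) {
    revs <- as.character(revs)
    offset <- if (sliding.window) 2 else 1
    if (length(revs) <= offset) {
        stop("Not enough revisions to construct a range.")
    }
    # index of the end of each range
    end.idx <- seq(from = offset + 1, to = length(revs))
    # index of the start of each range: fixed to 1 for cumulative ranges
    start.idx <- if (cumulative) rep(1, length(end.idx)) else end.idx - offset
    paste(revs[start.idx], revs[end.idx], sep = "-")
}

bins <- c("2016-01-01", "2017-01-01", "2018-01-01", "2019-01-01")
regular.ranges <- construct.ranges.sketch(bins)
cumulative.ranges <- construct.ranges.sketch(bins, cumulative = TRUE)
print(regular.ranges)
print(cumulative.ranges)
```

With the four fixed bins above, both calls return three ranges; in the cumulative case all of them start at 2016-01-01. Keeping the switch as a parameter avoids the naming conflict with the existing functions mentioned in the description.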