Closed ezmiller closed 3 years ago
@ezmiller this looks great to me.
It may be good to try this with (set! *warn-on-reflection* true)
and add type hints to avoid the performance cost:
https://clojure.org/reference/java_interop#typehints
@ezmiller this looks great to me.
It may be good to try this with
(set! *warn-on-reflection* true)
and add type hints to avoid the performance cost: https://clojure.org/reference/java_interop#typehints
@daslu thank you for these comments. I will integrate ASAP.
@daslu I'm going to merge this without the type hints just in case it's helpful to others to have it in. I'll add the type hints later.
Goal/Problem
We wanted to add some basic support for slicing datasets based on time units, using the indexing mechanism that we started working with in #1.
Proposed Solution
This PR adds some individual methods, each able to handle slicing by a specific time unit. It does not include methods for every unit of time, mostly because we expect this api to change and I wanted to get something in to unblock other development. The way these methods can be used is available in the accompanying test suite, but an example is:
Calling
slice-by-date
on a dataset that has not had an index set will not work.A key idea in the way these functions are implemented that may become central to our implementation is the expectation that the user who interacts with this library will be responsible for ensuring that the column that they choose to
index-by
matches the unit of time that they intend to slice by.In other words, this code will not work:
We may ultimately allow for more flexibility through the api, but for the moment this constraint allows us to move forward and keep things simple. One reason it may be necessary to depart from this constraint is that if we demand that users modify their datasets by adding or mutating the units of their time index, we may force the user to perform expensive (both in terms of time and space).
Next steps
My plan following this PR is to look into a technique for reduce the slicing API to a single method
slice
that can handle different time units, which matches the intention to simplify API interactions for the user that is in the spirit of the tabletablecloth
.