scicloj / tablecloth.time

Tools for the processing and manipulation of time-series data in Clojure.
Other
18 stars 1 forks source link

Update adjust-interval api to use auto-index detection #47

Closed ezmiller closed 2 years ago

ezmiller commented 3 years ago

Note: This PR should go in after #45, if it is approved.

Goal / Problem

The adjust-interval function had an API that was designed for when we were anticipating not knowing yet how we would get the index. We know that now because we have the index on the column in tmd.

Proposed Solution

This PR does the following to the adjust-interval API:

So usage now is like this:

(def data 
  (tablecloth/dataset {:A [#time/date "1970-01-01" #time/date "1970-01-02"
                             #time/date "1970-02-01" #time/date "1970-02-02"]
                         :B ["foo" "foo" "foo" "bar"]
                         :C (range 4)}))

data
;; => _unnamed [4 3]:
;;    |         :A |  :B | :C |
;;    |------------|-----|---:|
;;    | 1970-01-01 | foo |  0 |
;;    | 1970-01-02 | foo |  1 |
;;    | 1970-02-01 | foo |  2 |
;;    | 1970-02-02 | bar |  3 |

(-> data
  (adjust-interval tablecloth.time.api/->months-end)
  (tablecloth.api/ungroup)
;; => _unnamed [4 3]:
;;    |         :A |  :B | :C |
;;    |------------|-----|---:|
;;    | 1970-01-31 | foo |  0 |
;;    | 1970-01-31 | foo |  1 |
;;    | 1970-02-28 | foo |  2 |
;;    | 1970-02-28 | bar |  3 |

(-> data
  (adjust-interval tablecloth.time.api/->months-end)
  (tablecloth.api/aggregate (comp tech.v3.datatype.functional/mean :C))
;; => _unnamed [2 2]:
;;    | :summary |         :A |
;;    |---------:|------------|
;;    |      0.5 | 1970-01-31 |
;;    |      2.5 | 1970-02-28 |
ezmiller commented 3 years ago

After chatting with @rsiddharthan, we decide to make two changes:

ezmiller commented 2 years ago

Based on this still unfinished conversation with @rsiddharthan, I changed adjust-frequency to return an ungrouped dataset, unless the user overrides that behavior. This undoes the need for the include-columns argument that we were discussing. That said this conversation was unfinished and I hadn't grokked fully what @rsiddharthan was saying. Merging now anyway because lots of time has passed and I think this is a reasonable approach for now. We can always change course later.