techascent / tech.ml.dataset

A Clojure high performance data processing system
Eclipse Public License 1.0

Do `partition` and `partition-by` make any sense here in TMD? #379

Open harold opened 1 year ago

harold commented 1 year ago

`partition`, I guess, would turn a dataset into a sequence of datasets, with semantics and arguments similar to `clojure.core/partition`.
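A rough sketch of what that could look like, built only from `ds/row-count` and `ds/select-rows`. The name `partition-rows` and its argument order are hypothetical, not an existing TMD function. It partitions the row indices with `clojure.core/partition` and selects each group of rows as its own dataset:

```clojure
(require '[tech.v3.dataset :as ds])

(defn partition-rows
  "Hypothetical helper, not part of TMD. Splits `dataset` into a lazy
  sequence of datasets of `n` rows each, `step` rows apart. As with
  clojure.core/partition, a trailing group shorter than `n` is dropped."
  ([n dataset] (partition-rows n n dataset))
  ([n step dataset]
   (->> (range (ds/row-count dataset))
        (partition n step)
        (map (fn [idxs] (ds/select-rows dataset (vec idxs)))))))

;; Made-up example data: 7 rows, 2 columns.
(def example
  (ds/->dataset {:a (vec (range 7))
                 :b (mapv #(* % %) (range 7))}))

(map ds/row-count (partition-rows 3 example))
;; => (3 3)   ; the seventh row is dropped, as clojure.core/partition would
```

Swapping `partition` for `partition-all` inside the helper would keep the short trailing group instead of dropping it.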

`partition-by` is a little less clear to me. Perhaps the function would be passed the map-like rows, and the dataset would be split wherever that function starts returning a different value?
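One way to read that, as a sketch only: the name `partition-rows-by` is hypothetical, and it assumes a TMD version that provides `ds/rows` (older versions expose `ds/mapseq-reader` for the same purpose). Each row is paired with its index, `clojure.core/partition-by` splits wherever `f` changes value, and each run of indices becomes a dataset:

```clojure
(require '[tech.v3.dataset :as ds])

(defn partition-rows-by
  "Hypothetical helper, not part of TMD. Calls `f` on each map-like row
  and starts a new dataset every time the returned value changes."
  [f dataset]
  (->> (ds/rows dataset)
       (map-indexed vector)                      ; [row-index row] pairs
       (partition-by (fn [[_ row]] (f row)))     ; runs of equal (f row)
       (map (fn [run]
              (ds/select-rows dataset (mapv first run))))))

;; Made-up example data.
(def readings
  (ds/->dataset {:sensor [:a :a :b :b :a]
                 :value  [1 2 3 4 5]}))

(map ds/row-count (partition-rows-by #(get % :sensor) readings))
;; => (2 2 1)
```

Unlike a group-by, the `:a` rows end up in two separate datasets here, because only consecutive rows with the same value are kept together.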

genmeblog commented 1 year ago

I think it's a good idea! It was discussed in the past; take a look at this topic:

https://clojurians.zulipchat.com/#narrow/stream/236259-tech.2Eml.2Edataset.2Edev/topic/partition-by.20at.20tablecloth

and the following issue in TC (tablecloth):

https://github.com/scicloj/tablecloth/issues/30