scicloj / tablecloth

Dataset manipulation library built on the top of tech.ml.dataset
https://scicloj.github.io/tablecloth
MIT License

specifying the type on add-or-replace-column #11

Open daslu opened 3 years ago

daslu commented 3 years ago

Currently (checked with https://github.com/scicloj/tablecloth/commit/a067803de29daf58240f720014cd02b6fa4bca46), it is possible to specify the type of a new column using a call to dtype-next.

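;; Load the namespaces used below by their fully qualified names.
(require '[tablecloth.api]
         '[tech.v3.datatype])

;; Wrap the data in a :float32 reader so the new column gets that type.
(-> {:x [1 2]}
    (tablecloth.api/dataset)
    (tablecloth.api/add-or-replace-column
     :y (tech.v3.datatype/as-reader [1 2] :float32))
    :y)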

#tech.v3.dataset.column<float32>[2]
:y
[1.000, 2.000, ]

It could be nice to allow that directly as an option of the tablecloth API, without requiring the user to know what a "reader" is.

genmeblog commented 3 years ago

The user definitely shouldn't need to know about readers. There is a function for explicit datatype conversion (convert-types) that can be called after inserting a column. To support this in add-or-replace-column, the API should probably be changed to accept a map as the last parameter. Currently the last parameter is size-strategy. We need to think about how to avoid a breaking change here.
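
For illustration only, here is a minimal sketch of the two ideas in this comment. The first part shows the convert-types route that works without touching readers. The second part is a hypothetical wrapper, add-or-replace-typed-column, which is not part of tablecloth: it assumes the last argument may be either a size-strategy keyword (the existing behaviour) or an options map with the assumed keys :size-strategy and :datatype, which is one possible way to avoid a breaking change.

```clojure
(require '[tablecloth.api :as tc]
         '[tech.v3.datatype :as dtype])

;; Route available today: add the column, then convert its type explicitly.
(def converted
  (-> (tc/dataset {:x [1 2]})
      (tc/add-or-replace-column :y [1 2])
      (tc/convert-types :y :float32)))

;; Element datatype of the converted column.
(def y-type (dtype/elemwise-datatype (:y converted)))

;; HYPOTHETICAL wrapper, not a tablecloth function.
;; The last argument is either a size-strategy keyword, as in the current
;; API, or a map with the assumed keys :size-strategy and :datatype.
(defn add-or-replace-typed-column
  ([ds column-name column]
   (add-or-replace-typed-column ds column-name column {}))
  ([ds column-name column strategy-or-opts]
   (let [{:keys [size-strategy datatype]}
         (if (map? strategy-or-opts)
           strategy-or-opts
           {:size-strategy strategy-or-opts})
         ;; Only pass a size-strategy when one was given, so the library's
         ;; own default applies otherwise.
         added (if size-strategy
                 (tc/add-or-replace-column ds column-name column size-strategy)
                 (tc/add-or-replace-column ds column-name column))]
     ;; Convert only when a datatype was requested.
     (if datatype
       (tc/convert-types added column-name datatype)
       added))))

;; Usage of the hypothetical wrapper with an options map.
(def typed
  (add-or-replace-typed-column (tc/dataset {:x [1 2]})
                               :y [1 2]
                               {:datatype :float32}))

(def typed-y-type (dtype/elemwise-datatype (:y typed)))
```

Existing calls that pass a keyword as the fourth argument would keep working under this scheme, since the wrapper only switches behaviour when it receives a map.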

ashimapanjwani commented 3 years ago

@genmeblog Can I pick this up?

genmeblog commented 3 years ago

Yes! Thanks.