Closed lkyhfx closed 9 months ago
Hi! Thanks for the issue - very interesting one.
Maybe something did change here, because something similar came up recently on zulip: https://clojurians.zulipchat.com/#narrow/stream/236259-tech.2Eml.2Edataset.2Edev/topic/select.20via.20boolean.20broken.3F
In your case here, you can get the result you expect by not using a clojure vector:
user> (require '[tech.v3.dataset :as ds])
nil
user> (def ds (ds/->dataset [{:a 1 :b 2} {:a 2 :b 3} {:a 3 :b 4}]))
#'user/ds
user> (ds/select ds :all [true true true])
_unnamed [3 2]:
| :a | :b |
|---:|---:|
| 2 | 3 |
| 2 | 3 |
| 2 | 3 |
user> (ds/select ds :all (dtype/make-list :boolean [true true true]))
_unnamed [3 2]:
| :a | :b |
|---:|---:|
| 1 | 2 |
| 2 | 3 |
| 3 | 4 |
As we discovered in that zulip exchange, the types are determined in a fairly specific way: https://github.com/techascent/tech.ml.dataset/blob/cdb71cf623f426a2b5e883aeb355377892944bbb/src/tech/v3/dataset/impl/column.clj#L198
So, ...
user> (dtype/elemwise-datatype [true true true])
:object
user> (dtype/elemwise-datatype (dtype/make-list :boolean [true true true]))
:boolean
Would be interested to know what @cnuernber thinks about this now that it has come up twice. (:
PR: https://github.com/techascent/tech.ml.dataset/pull/390
Lots of options here, not sure this is the best, but we can iterate from here... (:
Run the following example
This produces:
However, the documentation for
select
states:When passing a boolean list, the position of
true
will be selected. Therefore, it should produce:Is the documentation outdated?