techascent / tech.ml.dataset

A Clojure high performance data processing system
Eclipse Public License 1.0

left-join on char column fails #361

Closed: genmeblog closed this issue 1 year ago

genmeblog commented 1 year ago

With the following setup, left-join on a char column throws an exception:

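;; Namespace aliases used throughout this report
(require '[tech.v3.dataset :as ds]
         '[tech.v3.dataset.join :as j])

(def ds1 (ds/->dataset {:a '(\1 \2 \3 \4 \5 \6 \7 \8 \9)}))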
(def ds2 (ds/->dataset {:a '(\0 \9 \8 \7 \6 \5 \4 \3 \2)}))

(ds1 :a)
;; => #tech.v3.dataset.column<char>[9]
;;    :a
;;    [1, 2, 3, 4, 5, 6, 7, 8, 9]

(ds2 :a)
;; => #tech.v3.dataset.column<char>[9]
;;    :a
;;    [0, 9, 8, 7, 6, 5, 4, 3, 2]

(j/left-join :a ds1 ds2)
1. Unhandled java.lang.RuntimeException
   Object cannot be casted to long: null

                Casts.java:   77  ham_fisted.Casts/longCast
           ArrayLists.java: 2142  ham_fisted.ArrayLists$CharArraySubList/fill
           ArrayLists.java:  178  ham_fisted.ArrayLists$ILongArrayList/fillRange
          array_buffer.clj:  116  tech.v3.datatype.array-buffer/ml-set-constant!
          array_buffer.clj:  114  tech.v3.datatype.array-buffer/ml-set-constant!
             protocols.clj:   97  tech.v3.datatype.protocols/eval16501/fn/G
                  base.clj:  861  tech.v3.datatype.base/set-constant!
                  base.clj:   -1  tech.v3.datatype.base/set-constant!
              datatype.clj:  557  tech.v3.datatype/set-constant!
              datatype.clj:  554  tech.v3.datatype/set-constant!
                column.clj:  451  tech.v3.dataset.impl.column/extend-column-with-empty
                column.clj:   -1  tech.v3.dataset.impl.column/extend-column-with-empty
                column.clj:  181  tech.v3.dataset.column/extend-column-with-empty
                column.clj:  179  tech.v3.dataset.column/extend-column-with-empty
                  join.clj:  189  tech.v3.dataset.join/finalize-join-result/fn
                  core.clj: 2770  clojure.core/map/fn
              LazySeq.java:   42  clojure.lang.LazySeq/sval
              LazySeq.java:   51  clojure.lang.LazySeq/seq
                   RT.java:  535  clojure.lang.RT/seq
                  core.clj:  139  clojure.core/seq
                  core.clj: 2774  clojure.core/map/fn
              LazySeq.java:   42  clojure.lang.LazySeq/sval
              LazySeq.java:   58  clojure.lang.LazySeq/seq
                 Cons.java:   39  clojure.lang.Cons/next
                   RT.java:  713  clojure.lang.RT/next
                  core.clj:   64  clojure.core/next
             protocols.clj:  169  clojure.core.protocols/fn
             protocols.clj:  124  clojure.core.protocols/fn
             protocols.clj:   19  clojure.core.protocols/fn/G
             protocols.clj:   31  clojure.core.protocols/seq-reduce
             protocols.clj:   75  clojure.core.protocols/fn
             protocols.clj:   75  clojure.core.protocols/fn
             protocols.clj:   13  clojure.core.protocols/fn/G
                  core.clj: 6886  clojure.core/reduce
                  core.clj: 6868  clojure.core/reduce
                  join.clj:   71  tech.v3.dataset.join/nice-column-names
                  join.clj:   63  tech.v3.dataset.join/nice-column-names
               RestFn.java:  421  clojure.lang.RestFn/invoke
                  join.clj:  194  tech.v3.dataset.join/finalize-join-result
                  join.clj:  134  tech.v3.dataset.join/finalize-join-result
                  join.clj:  293  tech.v3.dataset.join/hash-join
                  join.clj:  261  tech.v3.dataset.join/hash-join
                  join.clj:  337  tech.v3.dataset.join/left-join
                  join.clj:  327  tech.v3.dataset.join/left-join
                  join.clj:  335  tech.v3.dataset.join/left-join
                  join.clj:  327  tech.v3.dataset.join/left-join

genmeblog commented 1 year ago

When a dataset is joined with itself, everything works:

(j/left-join :a ds1 ds1)
;; => left-outer-join [9 2]:
;;    | :a | :right.a |
;;    |----|----------|
;;    |  1 |        1 |
;;    |  2 |        2 |
;;    |  3 |        3 |
;;    |  4 |        4 |
;;    |  5 |        5 |
;;    |  6 |        6 |
;;    |  7 |        7 |
;;    |  8 |        8 |
;;    |  9 |        9 |

(j/left-join :a ds2 ds2)
;; => left-outer-join [9 2]:
;;    | :a | :right.a |
;;    |----|----------|
;;    |  0 |        0 |
;;    |  9 |        9 |
;;    |  8 |        8 |
;;    |  7 |        7 |
;;    |  6 |        6 |
;;    |  5 |        5 |
;;    |  4 |        4 |
;;    |  3 |        3 |
;;    |  2 |        2 |
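
One possible reading of the difference, offered as a guess and not a confirmed diagnosis: in a self-join every left key finds a match, so the right-hand column never needs a missing value, while joining ds1 against ds2 leaves one left key without a partner. That would fit the extend-column-with-empty frame in the stack trace above. The sketch below uses only clojure.set on the key values copied from the report; left-keys, right-keys and unmatched are illustrative names, not part of tech.ml.dataset.

```clojure
(require '[clojure.set :as set])

;; Key values copied from ds1 and ds2 in the report above.
(def left-keys  (set [\1 \2 \3 \4 \5 \6 \7 \8 \9]))
(def right-keys (set [\0 \9 \8 \7 \6 \5 \4 \3 \2]))

;; Left keys with no partner on the right. A left join has to emit
;; a missing value in the right-hand column for each of these.
(def unmatched (set/difference left-keys right-keys))

unmatched
;; => #{\1}

;; In a self-join every key matches, so nothing has to be filled in.
(set/difference left-keys left-keys)
;; => #{}
```

If this reading is right, the failure depends on at least one unmatched left key being present, not on the char datatype alone.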