massung / tabular-asa

A column-oriented, dataframe implementation for Racket.
MIT License

`table-groupby` returns incorrect results after `table-select` #2

Closed samdphillips closed 5 months ago

samdphillips commented 2 years ago

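In this example program, the subtables returned by `table-groupby` are incorrect. After the `table-select`, the `a` group should hold the values 1 and 4 and the `c` group should hold 3 and 6, but the output below shows 1, 3 and 2, 4.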

#lang racket/base

(require tabular-asa
         racket/sequence)

(define tbl
  (table-read/sequence
   '((a 1)
     (b 2)
     (c 3)
     (a 4)
     (b 5)
     (c 6))
   '(name value)))

(define select-seq
  (sequence-map
   (lambda (v)
     (or (eq? (car v) 'a) (eq? (car v) 'c)))
   (table-rows (table-cut tbl '(name)))))

(define tbl2
  (table-select tbl select-seq))

(define grouping (table-groupby tbl2 '(name)))

(displayln tbl)
(displayln tbl2)
(for ([(k t) grouping])
  (displayln k)
  (displayln t)
  (newline))

Program output:

       name   value
   0      a       1
   1      b       2
   2      c       3
   3      a       4
   4      b       5
   5      c       6

[6 rows x 2 cols]

       name   value
   0      a       1
   2      c       3
   3      a       4
   5      c       6

[4 rows x 2 cols]

((name a))
       value
   0       1
   2       3

[2 rows x 1 cols]

((name c))
       value
   1       2
   3       4

[2 rows x 1 cols]
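
For comparison, here is a minimal sketch of the expected grouping, written with plain Racket lists and `group-by` from `racket/list` rather than with tabular-asa. The names `rows`, `selected`, and `expected-groups` are made up for this illustration and are not part of the library.

```racket
#lang racket/base

(require racket/list)

;; Same rows as the report, as plain lists of (name value).
(define rows
  '((a 1) (b 2) (c 3) (a 4) (b 5) (c 6)))

;; Keep only the rows named a or c, mirroring the table-select step.
(define selected
  (filter (lambda (row) (memq (car row) '(a c))) rows))

;; Group the surviving rows by name and keep just the values.
;; group-by preserves the original order within each group.
(define expected-groups
  (for/list ([group (group-by car selected)])
    (cons (car (car group)) (map cadr group))))

(displayln expected-groups) ; prints ((a 1 4) (c 3 6))
```

So the `a` subtable should contain 1 and 4 (index 0 and 3) and the `c` subtable 3 and 6 (index 2 and 5).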

massung commented 5 months ago

Noting that I have a fix for this. I'm adding some additional tests to catch it in the future.

massung commented 5 months ago

Fixed by f0c8790. Note: this also fixes `table-groupby` when the table index isn't in natural order (e.g. `(table-groupby (table-reverse ...))` would break as well).
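
The numbers in the report are consistent with row positions in the filtered table being used as if they were original row indices. The sketch below reproduces them in plain Racket under that assumption. It is only an illustration and is not taken from the fix commit or from the tabular-asa source; every name in it (`names`, `values-col`, `index`, `positions-of`, `group-values/wrong`, `group-values/right`) is hypothetical.

```racket
#lang racket/base

;; Underlying column data, addressed by original row index.
(define names      (vector 'a 'b 'c 'a 'b 'c))
(define values-col (vector 1 2 3 4 5 6))

;; Index left behind by the select step: original rows 0, 2, 3 and 5.
(define index '(0 2 3 5))

;; Positions, within the filtered view, of the rows with the given name.
(define (positions-of name)
  (for/list ([i (in-list index)]
             [pos (in-naturals)]
             #:when (eq? (vector-ref names i) name))
    pos))

;; Assumed mistake: use a position directly as an original row index.
(define (group-values/wrong name)
  (for/list ([pos (in-list (positions-of name))])
    (vector-ref values-col pos)))

;; Intended behaviour: map each position back through the index first.
(define (group-values/right name)
  (for/list ([pos (in-list (positions-of name))])
    (vector-ref values-col (list-ref index pos))))

(displayln (group-values/wrong 'a))  ; prints (1 3), as in the report
(displayln (group-values/wrong 'c))  ; prints (2 4), as in the report
(displayln (group-values/right 'a))  ; prints (1 4)
(displayln (group-values/right 'c))  ; prints (3 6)
```

Under the same assumption, a reversed table would hit the same problem, because its index is also no longer 0, 1, 2, and so on.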