massung / tabular-asa

A column-oriented, dataframe implementation for Racket.
MIT License

Bug when using table-join/inner where first table is filtered? #6

Closed jbclements closed 4 months ago

jbclements commented 4 months ago

It looks to me like using table-join/inner on an example where the first table is a filtered table causes an error. For instance, when I run this code:

#lang racket

(require tabular-asa)

(define students
  (table-read/columns '(("a" "b" "c" "d" "e" "f" "g" "h" "i")
                        ("Enrolled" "blah" "Waiting" "blah" "Enrolled" "Waiting"
                                    "Enrolled" "Waiting" "Enrolled")
                        (3 3 3 5 5 5 7 7 7))
                      '(Name Status Section)))

(define statuses-of-interest '("Enrolled"))

(define roster-table
  (table-read/columns '(("bob" "susan" "joe")
                        (3 5 7))
                      '(|Instructor Name| Section)))
;; filter to Enrolled or Enrolled+Waiting
(define filtered-students
  (table-filter students (λ (status) (equal? status "Enrolled")) '(Status)))

(define instructor-groups
  (table-join/inner filtered-students roster-table '(Section)))

I get the error:

../../Applications/Racket v8.12/collects/racket/private/kw.rkt:1418:57: vector-ref: index is out of range
  index: 4
  valid range: [0, 3]
  vector: '#(0 4 6 8)

Switching the order of the table arguments yields what appears to be the correct result. It looks like you can also hack around this by converting to rows and back, restoring the index to a simple 0..n-1 vector.

Any idea what's going on here?

jbclements commented 4 months ago

Could just be iterating over the length of the original table rather than the length of the index....
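
A minimal sketch of that hypothesis, using plain Racket vectors rather than tabular-asa's actual internals. The names `section-data`, `index`, `visible-values/buggy`, and `visible-values` are made up for illustration, and the assumption is that a filtered table keeps the full column data plus an index vector of the surviving rows:

```racket
#lang racket

;; Assumed model of a filtered table: the full column data, plus an
;; index vector listing which rows survived the filter.
(define section-data (vector 3 3 3 5 5 5 7 7 7))
(define index (vector 0 4 6 8)) ; rows where Status is "Enrolled"

;; Suspected pattern: the loop bound comes from the length of the
;; underlying data (9), not from the length of the index (4).
(define (visible-values/buggy data index)
  (for/list ([i (in-range (vector-length data))])
    (vector-ref data (vector-ref index i))))

;; Index-aware pattern: iterate over the index itself.
(define (visible-values data index)
  (for/list ([row (in-vector index)])
    (vector-ref data row)))

(visible-values section-data index) ; => '(3 5 7 7)

;; Calling (visible-values/buggy section-data index) raises
;; "vector-ref: index is out of range" once i reaches 4, since the
;; index vector '#(0 4 6 8) only has positions 0 through 3.
```

If something like this is happening, it would also explain why an unfiltered first table works: its index and its data have the same length, so the two loop bounds coincide.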

massung commented 4 months ago

Fixed in 080fb1b. I also added regression tests, including for the case where the right table is the filtered one. Thanks for the bug report!