kelindar / column

High-performance, columnar, in-memory store with bitmap indexing in Go
MIT License

no way to issue where clause `a in ("1", "2") and b in ("5", "6")` #55

Closed: on99 closed this issue 2 years ago

on99 commented 2 years ago
    c := column.NewCollection()
    c.CreateColumn("a", column.ForString())
    c.CreateColumn("b", column.ForString())
    c.CreateIndex("a_1", "a", func(r column.Reader) bool { return r.String() == "1" })
    c.CreateIndex("a_2", "a", func(r column.Reader) bool { return r.String() == "2" })
    c.CreateIndex("a_3", "a", func(r column.Reader) bool { return r.String() == "3" })
    c.CreateIndex("b_4", "b", func(r column.Reader) bool { return r.String() == "4" })
    c.CreateIndex("b_5", "b", func(r column.Reader) bool { return r.String() == "5" })
    c.CreateIndex("b_6", "b", func(r column.Reader) bool { return r.String() == "6" })

    c.Query(func(txn *column.Txn) error {
        fmt.Println(txn.InsertObject(map[string]interface{}{
            "a": "1",
            "b": "4",
        }))
        fmt.Println(txn.InsertObject(map[string]interface{}{
            "a": "2",
            "b": "5",
        }))
        fmt.Println(txn.InsertObject(map[string]interface{}{
            "a": "3",
            "b": "6",
        }))
        return nil
    })
    c.Query(func(txn *column.Txn) error {
        // no way to issue where clause `a in ("1", "2") and b in ("5", "6")`, count should be 1
        fmt.Println(txn.Union("a_1", "a_2").Union("b_5", "b_6").Count()) // 3
        fmt.Println(txn.With("a_1", "a_2").With("b_5", "b_6").Count()) // 0
        fmt.Println(txn.With("a_1", "a_2").Union("b_5", "b_6").Count()) // 2
        fmt.Println(txn.Union("a_1", "a_2").With("b_5", "b_6").Count()) // 0
        return nil
    })
kelindar commented 2 years ago

Indeed, this is a limitation of the current API. To solve this properly we would need a multi-column index, I believe, so you could build an index that takes both the a and b columns and applies the predicate to both. Currently, you would only be able to solve it with Range() itself.