metrico / quackpipe

QuackPipe is an OLAP API built on top of DuckDB with a few extra ClickHouse compatibility bits
https://quackpipe.fly.dev
MIT License

Create a mechanism to write parquet files into a specific folder on disk and merge them after a while #30

Open akvlad opened 1 month ago

akvlad commented 1 month ago

How

Create a service implementing the following interface and helper types

type Table struct {
    Name    string
    Path    string
    Fields  [][2]string // field name and type
    OrderBy []string
}

type IMergeTree interface {
    Store(table *Table, columns []string, data []any) error
    Merge(table *Table) error
}

Description of the method Store(table *Table, columns []string, data []any) error

The Store method should

The Merge method should

Testing

The following call should create a parquet file:

    var mt IMergeTree // assign the IMergeTree implementation under test here
    mt.Store(&Table{
        Name:    "example",
        Path:    "/tmp/example",
        Fields:  [][2]string{{"timestamp", "UInt64"}, {"str", "String"}, {"value", "Float64"}},
        OrderBy: []string{"timestamp"},
    }, []string{"timestamp", "str", "value"}, []any{
        []uint64{1628596000, 1628596001, 1628596002},
        []string{"a", "b", "c"},
        []float64{1.1, 2.2, 3.3},
    })
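Before writing anything to disk, an implementation would probably want to reject malformed batches. The sketch below is one possible pre-check, not part of the issue's specification: `validateBatch` and `columnLen` are hypothetical helper names, and the mapping from the type strings `UInt64`, `String` and `Float64` to Go slice types is an assumption based only on the example call above.

```go
package main

import (
	"errors"
	"fmt"
)

// Table mirrors the helper type from the issue.
type Table struct {
	Name    string
	Path    string
	Fields  [][2]string // field name and type
	OrderBy []string
}

// columnLen returns the row count of one column slice, or an error if the
// Go slice type does not match the declared field type. Only the three types
// used in the issue's example are handled; this mapping is an assumption.
func columnLen(fieldType string, col any) (int, error) {
	switch c := col.(type) {
	case []uint64:
		if fieldType == "UInt64" {
			return len(c), nil
		}
	case []string:
		if fieldType == "String" {
			return len(c), nil
		}
	case []float64:
		if fieldType == "Float64" {
			return len(c), nil
		}
	}
	return 0, fmt.Errorf("data of type %T does not match field type %q", col, fieldType)
}

// validateBatch is a hypothetical pre-check that Store could run first.
// It does not detect duplicate column names.
func validateBatch(table *Table, columns []string, data []any) error {
	if table == nil {
		return errors.New("table is nil")
	}
	if len(columns) != len(data) {
		return fmt.Errorf("got %d column names but %d data slices", len(columns), len(data))
	}
	// Look up declared types by field name.
	types := make(map[string]string, len(table.Fields))
	for _, f := range table.Fields {
		types[f[0]] = f[1]
	}
	rows := -1 // row count of the first column, once known
	for i, name := range columns {
		fieldType, ok := types[name]
		if !ok {
			return fmt.Errorf("unknown column %q", name)
		}
		n, err := columnLen(fieldType, data[i])
		if err != nil {
			return fmt.Errorf("column %q: %w", name, err)
		}
		if rows >= 0 && n != rows {
			return fmt.Errorf("column %q has %d rows, expected %d", name, n, rows)
		}
		rows = n
	}
	return nil
}

func main() {
	t := &Table{
		Name:    "example",
		Path:    "/tmp/example",
		Fields:  [][2]string{{"timestamp", "UInt64"}, {"str", "String"}, {"value", "Float64"}},
		OrderBy: []string{"timestamp"},
	}
	// The "str" column is one row short, so an error is reported.
	err := validateBatch(t, []string{"timestamp", "str"}, []any{
		[]uint64{1628596000, 1628596001},
		[]string{"a"},
	})
	fmt.Println(err)
}
```

Each rejected case (unknown column, type mismatch, unequal row counts, count mismatch between names and data) could map to one negative unit test, with the example call above as the positive one.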

Create a set of unit tests for the positive scenario and several negative scenarios:

akvlad commented 2 weeks ago

var t = &Table{
    Name:    "experimental",
    Path:    "....",
    Fields:  [][2]string{{"a", "UInt64"}, {"b", "String"}, {"x", "Float64"}},
    OrderBy: []string{"a"},
}

mt.Store(t, []string{"x", "b", "a"}, []any{
    []float64{1.1, 2.2, 3.3},
    []string{"a", "b", "c"},
    []uint64{1628596000, 1628596001, 1628596002},
})

parquet file as a result:

a           b    x
1628596000  "a"  1.1
1628596001  "b"  2.2
1628596002  "c"  3.3
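The example above passes the columns in the order x, b, a, while the resulting file has them in the table's field order a, b, x. One way to get there is to reorder the data slices by name before writing. This is only a sketch: `reorderToSchema` is a hypothetical helper, and requiring every field to be present in the batch is an assumption the issue does not state.

```go
package main

import "fmt"

// Table mirrors the helper type from the issue.
type Table struct {
	Name    string
	Path    string
	Fields  [][2]string // field name and type
	OrderBy []string
}

// reorderToSchema returns the data slices arranged in table.Fields order,
// regardless of the order in which the caller listed the columns.
// Assumption: every field of the table must be present in the batch.
func reorderToSchema(table *Table, columns []string, data []any) ([]any, error) {
	if len(columns) != len(data) {
		return nil, fmt.Errorf("got %d column names but %d data slices", len(columns), len(data))
	}
	// Remember where each named column sits in the caller's order.
	pos := make(map[string]int, len(columns))
	for i, name := range columns {
		pos[name] = i
	}
	out := make([]any, 0, len(table.Fields))
	for _, f := range table.Fields {
		i, ok := pos[f[0]]
		if !ok {
			return nil, fmt.Errorf("missing data for field %q", f[0])
		}
		out = append(out, data[i])
	}
	return out, nil
}

func main() {
	// Path is left empty because this sketch never touches the disk.
	t := &Table{
		Name:    "experimental",
		Fields:  [][2]string{{"a", "UInt64"}, {"b", "String"}, {"x", "Float64"}},
		OrderBy: []string{"a"},
	}
	ordered, err := reorderToSchema(t, []string{"x", "b", "a"}, []any{
		[]float64{1.1, 2.2, 3.3},
		[]string{"a", "b", "c"},
		[]uint64{1628596000, 1628596001, 1628596002},
	})
	if err != nil {
		panic(err)
	}
	// Print each field name next to its column, in schema order.
	for i, f := range t.Fields {
		fmt.Println(f[0], ordered[i])
	}
}
```

Sorting the rows by the OrderBy columns is a separate step that this sketch does not cover.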