timescale / timescaledb-parallel-copy

A binary for parallel copying of CSV data into a TimescaleDB hypertable
https://www.timescale.com/
Apache License 2.0

Allowing use as a go module (feature request) #58

Closed (bharathcs closed 4 months ago)

bharathcs commented 2 years ago

First off, great work with this package, it's been very useful!

However, it would be even more useful if it could be imported as a Go package rather than used only as a command-line program. Redirecting stdin / stdout instead of reading from a file is already helpful, but developers could write more capable programs if they could pass in any suitable reader as the data source.

For example, people who routinely upload bulk data, with a script or otherwise, could take better advantage of this package by writing their own Go program.

Here is the rough idea I have:

Here's a possible example of another developer using this package once such functionality exists:

package main

import (
    "io"

    copy "github.com/timescale/timescaledb-parallel-copy"
)

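func main() {
    // Proposed configuration API; none of these functions exist yet.
    c := copy.SetConnection( /* ... */ ).SetTableName( /* ... */ ).SetWorkers( /* ... */ )

    // Stream CSV data through an in-memory pipe instead of a file or stdin.
    r, w := io.Pipe()
    go func() {
        // writeCsvData is supplied by the caller and writes CSV rows to w.
        writeCsvData(w)
        w.Close() // closing the writer signals EOF to the reader
    }()

    // Run reads CSV rows from r until EOF.
    err := c.Run(r, copy.WithConnection( /* ... */ ), copy.SetTableName( /* ... */ ), copy.SetWorkers(5), /* ... */ )
    if err != nil {
        panic(err)
    }
}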
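The io.Pipe pattern in the example above can be tried with the standard library alone. The following is a minimal runnable sketch in which countRows stands in for the proposed copier (it only counts rows rather than inserting them), and writeCSVData, countRows, and streamRows are all hypothetical names chosen for illustration.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
)

// writeCSVData is a hypothetical producer that encodes rows as CSV into w.
func writeCSVData(w io.Writer, rows [][]string) error {
	// WriteAll writes every record and then flushes the writer.
	return csv.NewWriter(w).WriteAll(rows)
}

// countRows stands in for the copier: it consumes CSV from any io.Reader
// and reports how many records it saw.
func countRows(r io.Reader) (int, error) {
	cr := csv.NewReader(r)
	n := 0
	for {
		_, err := cr.Read()
		if err == io.EOF {
			return n, nil
		}
		if err != nil {
			return n, err
		}
		n++
	}
}

// streamRows connects the producer and the consumer with an in-memory pipe.
func streamRows(rows [][]string) (int, error) {
	r, w := io.Pipe()
	// Closing the read side unblocks the producer if the consumer stops early.
	defer r.Close()
	go func() {
		// CloseWithError(nil) is equivalent to Close, so the reader sees
		// EOF on success and the producer's error otherwise.
		w.CloseWithError(writeCSVData(w, rows))
	}()
	return countRows(r)
}

func main() {
	n, err := streamRows([][]string{{"1", "a"}, {"2", "b"}, {"3", "c"}})
	if err != nil {
		panic(err)
	}
	fmt.Println("rows read:", n)
}
```

Using CloseWithError rather than a bare Close lets a failure in the producer surface as the consumer's read error instead of looking like a clean end of input.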

I ran into this use case myself and have already made the changes needed to support it. I'd be happy to clean up my work and submit it as a proper PR if the devs at TimescaleDB are open to this feature request.

alejandrodnm commented 4 months ago

This was implemented in https://github.com/timescale/timescaledb-parallel-copy/pull/77