timeplus-io/sling-cli

Slings from a data source to a data target.

Sling is a passion project turned into a free CLI product that offers an easy way to create and maintain small- to medium-volume data pipelines using the Extract & Load (EL) approach. It focuses on data movement between:

Database to Database
File System to Database
Database to File System

https://github.com/slingdata-io/sling-cli/assets/7671010/e10ee716-1de8-4d53-8eb2-95c6d9d7f9f0

Some key features:

Single Binary deployment (built with Go). See installation page.
Use Custom SQL as a stream: --src-stream='select * from my_table where col1 > 10'
Manage / View / Test / Discover your connections with the sling conns sub-command
Use environment variables as connections if you prefer (export MY_PG='postgres://...')
Provide YAML or JSON configurations (perfect for git version control).
Powerful Replication logic, to replicate many tables with a wildcard (my_schema.*).
Reads your existing DBT connections
Use your environment variables in your YAML / JSON config (select * from my_table where date = '{date}')
Convenient Transformations, such as the flatten option, which auto-creates columns from your nested fields.
Run Pre & Post SQL commands.
And many more!

Example Replication:

replication.yaml

Available Connectors:

Databases: bigquery bigtable clickhouse duckdb mariadb motherduck mysql oracle postgres redshift snowflake sqlite sqlserver starrocks prometheus proton
File Systems: azure b2 dospaces gs local minio r2 s3 sftp wasabi
File Formats: csv, parquet, xlsx, json, avro, xml, sas7bdat

Ever wanted to quickly pipe a CSV or JSON file into your database? Use sling to do so:

cat my_file.csv | sling run --tgt-conn MYDB --tgt-object my_schema.my_table

Or want to copy data between two databases? Do it with sling:

sling run --src-conn PG_DB --src-stream public.transactions \
  --tgt-conn MYSQL_DB --tgt-object mysql.bank_transactions \
  --mode full-refresh

Sling can also easily manage your local connections with the sling conns command:

$ sling conns set MY_PG url='postgresql://postgres:myPassword@pghost:5432/postgres'

$ sling conns list
+--------------------------+-----------------+-------------------+
| CONN NAME                | CONN TYPE       | SOURCE            |
+--------------------------+-----------------+-------------------+
| AWS_S3                   | FileSys - S3    | sling env yaml    |
| FINANCE_BQ               | DB - BigQuery   | sling env yaml    |
| DO_SPACES                | FileSys - S3    | sling env yaml    |
| LOCALHOST_DEV            | DB - PostgreSQL | dbt profiles yaml |
| MSSQL                    | DB - SQLServer  | sling env yaml    |
| MYSQL                    | DB - MySQL      | sling env yaml    |
| ORACLE_DB                | DB - Oracle     | env variable      |
| MY_PG                    | DB - PostgreSQL | sling env yaml    |
+--------------------------+-----------------+-------------------+

$ sling conns discover LOCALHOST_DEV
9:05AM INF Found 344 streams:
 - "public"."accounts"
 - "public"."bills"
 - "public"."connections"
 ...

Installation

Brew on Mac

brew install slingdata-io/sling/sling

# You're good to go!
sling -h

Scoop on Windows

scoop bucket add sling https://github.com/slingdata-io/scoop-sling.git
scoop install sling

# You're good to go!
sling -h

Binary on Linux

curl -LO 'https://github.com/slingdata-io/sling-cli/releases/latest/download/sling_linux_amd64.tar.gz' \
  && tar xf sling_linux_amd64.tar.gz \
  && rm -f sling_linux_amd64.tar.gz \
  && chmod +x sling

# You're good to go!
sling -h

Compiling From Source

Requirements:

Install Go 1.22+ (https://go.dev/doc/install)
Install a C compiler (gcc, tdm-gcc, mingw, etc)

Linux or Mac

git clone https://github.com/slingdata-io/sling-cli.git
cd sling-cli
bash scripts/build.sh

./sling --help

Windows (PowerShell)

git clone https://github.com/slingdata-io/sling-cli.git
cd sling-cli

.\scripts\build.ps1

.\sling --help

Automated Dev Builds

Here are links to the official development builds, which are the latest builds of the upcoming release.

Linux (x64): https://f.slingdata.io/dev/latest/sling_linux_amd64.tar.gz
Mac (arm64): https://f.slingdata.io/dev/latest/sling_darwin_arm64.tar.gz
Windows (x64): https://f.slingdata.io/dev/latest/sling_windows_amd64.tar.gz
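If you script the download, the three links above can be keyed by platform. The following is a minimal sketch, not part of Sling: `devBuilds` and `devBuildURL` are hypothetical names, and only the platforms listed above are covered.

```go
package main

import (
	"fmt"
	"runtime"
)

// devBuilds maps "GOOS/GOARCH" to the dev-build archives listed above.
// Platforms not listed in this README are intentionally absent.
var devBuilds = map[string]string{
	"linux/amd64":   "https://f.slingdata.io/dev/latest/sling_linux_amd64.tar.gz",
	"darwin/arm64":  "https://f.slingdata.io/dev/latest/sling_darwin_arm64.tar.gz",
	"windows/amd64": "https://f.slingdata.io/dev/latest/sling_windows_amd64.tar.gz",
}

// devBuildURL is a hypothetical helper: it returns the archive URL for the
// given OS and architecture, and false when no dev build is listed.
func devBuildURL(goos, goarch string) (string, bool) {
	u, ok := devBuilds[goos+"/"+goarch]
	return u, ok
}

func main() {
	if u, ok := devBuildURL(runtime.GOOS, runtime.GOARCH); ok {
		fmt.Println("dev build:", u)
	} else {
		fmt.Printf("no dev build listed for %s/%s\n", runtime.GOOS, runtime.GOARCH)
	}
}
```

For any other platform, compiling from source as described above is the fallback.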

Installing via Python Wrapper

pip install sling

Then you should be able to run sling --help from the command line.

Running an Extract-Load Task

CLI

sling run --src-conn POSTGRES_URL --src-stream myschema.mytable \
  --tgt-conn SNOWFLAKE_URL --tgt-object yourschema.yourtable \
  --mode full-refresh

Or pass a YAML/JSON string or file:

sling run -c '
source:
  conn: $POSTGRES_URL
  stream: myschema.mytable

target:
  conn: $SNOWFLAKE_URL
  object: yourschema.yourtable

mode: full-refresh
'
# OR
sling run -c /path/to/config.json

From Lib

package main

import (
    "log"

    "github.com/slingdata-io/sling-cli/core/sling"
)

func main() {
    // cfgStr can be JSON or YAML
    cfgStr := `
    source:
        conn: $POSTGRES_URL
        stream: myschema.mytable

    target:
        conn: $SNOWFLAKE_URL
        object: yourschema.yourtable

    mode: full-refresh
    `
    cfg, err := sling.NewConfig(cfgStr)
    if err != nil {
        log.Fatal(err)
    }

    err = sling.Sling(cfg)
    if err != nil {
        log.Fatal(err)
    }
}

Config Schema

An example. Put this in https://jsonschema.net/

--src-conn/source.conn and --tgt-conn/target.conn can be a connection name or a URL (database, folder or file):

MY_PG (connection ref in db, profile or env)
$MY_PG (connection ref in env)
postgresql://user:password!@host.loc:5432/database
s3://my_bucket/my_folder/file.csv
gs://my_google_bucket/my_folder/file.json
file:///tmp/my_folder/file.csv (local storage)
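The forms above can be told apart by shape alone. The sketch below is illustrative only and is not how Sling resolves connections: `connKind` is a hypothetical helper that assumes a leading `$` marks an environment reference and a `scheme://` prefix marks a URL.

```go
package main

import (
	"fmt"
	"strings"
)

// connKind is a hypothetical classifier for the connection forms listed
// above. It only inspects the string; it does not look anything up.
func connKind(value string) string {
	// "$MY_PG" style: a reference to an environment variable.
	if strings.HasPrefix(value, "$") {
		return "env reference"
	}
	// "scheme://..." style: a database, bucket or local file URL.
	if scheme, _, found := strings.Cut(value, "://"); found && scheme != "" {
		return "url:" + scheme
	}
	// Anything else is treated as a named connection.
	return "connection name"
}

func main() {
	examples := []string{
		"MY_PG",
		"$MY_PG",
		"postgresql://user:password!@host.loc:5432/database",
		"s3://my_bucket/my_folder/file.csv",
		"file:///tmp/my_folder/file.csv",
	}
	for _, e := range examples {
		fmt.Printf("%-55s %s\n", e, connKind(e))
	}
}
```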

--src-stream/source.stream can be an object name, a SQL query or a path to a SQL file to stream from:

TABLE1
SCHEMA1.TABLE2
OBJECT_NAME
select * from SCHEMA1.TABLE3
/path/to/file.sql (if source conn is DB)
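As a rough illustration of how these stream forms differ, the sketch below sorts a value into one of three buckets. It is a heuristic written for this README, not Sling's own logic: `streamKind` is a hypothetical name, and it assumes that a `.sql` suffix means a file and that any whitespace means a query.

```go
package main

import (
	"fmt"
	"strings"
)

// streamKind is a hypothetical heuristic for the stream forms listed above.
func streamKind(value string) string {
	v := strings.TrimSpace(value)
	switch {
	case strings.HasSuffix(strings.ToLower(v), ".sql"):
		// e.g. /path/to/file.sql
		return "sql file"
	case strings.ContainsAny(v, " \t\n"):
		// e.g. select * from SCHEMA1.TABLE3
		return "sql query"
	default:
		// e.g. TABLE1 or SCHEMA1.TABLE2
		return "object name"
	}
}

func main() {
	examples := []string{
		"TABLE1",
		"SCHEMA1.TABLE2",
		"select * from SCHEMA1.TABLE3",
		"/path/to/file.sql",
	}
	for _, e := range examples {
		fmt.Printf("%-30s %s\n", e, streamKind(e))
	}
}
```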

--tgt-object/target.object can be an object name to write to:

TABLE1
SCHEMA1.TABLE2

Example as JSON

{
  "source": {
    "conn": "MY_PG_URL",
    "stream": "select * from my_table",
    "options": {}
  },
  "target": {
    "conn": "s3://my_bucket/my_folder/new_file.csv",
    "options": {
      "header": false
    }
  }
}
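To sanity-check a config like the one above before handing it to sling, you can decode it with the standard library. This is a minimal sketch: `Endpoint`, `Config` and `parseConfig` are hypothetical and cover only the keys shown in this README, so they are not Sling's own types.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// Endpoint is a hypothetical struct covering the source/target keys
// shown in this README.
type Endpoint struct {
	Conn    string         `json:"conn"`
	Stream  string         `json:"stream,omitempty"`
	Object  string         `json:"object,omitempty"`
	Options map[string]any `json:"options,omitempty"`
}

// Config is a hypothetical top-level shape: source, target and mode.
type Config struct {
	Source Endpoint `json:"source"`
	Target Endpoint `json:"target"`
	Mode   string   `json:"mode,omitempty"`
}

// exampleJSON repeats the JSON example above.
const exampleJSON = `{
  "source": {
    "conn": "MY_PG_URL",
    "stream": "select * from my_table",
    "options": {}
  },
  "target": {
    "conn": "s3://my_bucket/my_folder/new_file.csv",
    "options": {
      "header": false
    }
  }
}`

// parseConfig decodes a JSON config string into Config.
func parseConfig(raw string) (Config, error) {
	var cfg Config
	err := json.Unmarshal([]byte(raw), &cfg)
	return cfg, err
}

func main() {
	cfg, err := parseConfig(exampleJSON)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("source stream:", cfg.Source.Stream)
	fmt.Println("target conn:  ", cfg.Target.Conn)
	fmt.Println("write header: ", cfg.Target.Options["header"])
}
```

Decoding into a struct catches malformed JSON early. Keys outside this sketch are silently ignored by `json.Unmarshal`, so it does not validate the full schema.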