grailbio / bigslice

A serverless cluster computing system for the Go programming language
https://bigslice.io/
Apache License 2.0
bigdata cluster computing etl go golang machinelearning mapreduce

Bigslice

Bigslice is a serverless cluster data processing system for Go. Bigslice exposes a composable API that lets the user express data processing tasks as a series of data transformations that invoke user code. The Bigslice runtime then transparently parallelizes and distributes the work, using the Bigmachine library to create an ad hoc cluster on a cloud provider.

Developing Bigslice

Bigslice uses Go modules to capture its dependencies; no tooling other than the base Go install is required.

$ git clone https://github.com/grailbio/bigslice
$ cd bigslice
$ GO111MODULE=on go test

If tests fail with "socket: too many open files" errors, try raising the maximum number of open files for the current shell.

$ ulimit -n 2000