estuary / flow

🌊 Continuously synchronize the systems where your data lives, to the systems where you _want_ it to live, with Estuary Flow. 🌊
https://estuary.dev
Other
638 stars 56 forks source link

CI Slack | Docs home | Free account | Data platform comparison reference | Email list

Build millisecond-latency, scalable, future-proof data pipelines in minutes.

Estuary Flow is a DataOps platform that integrates all of the systems you use to produce, process, and consume data.

Flow unifies today's batch and streaming paradigms so that your systems – current and future – are synchronized around the same datasets, updating in milliseconds.

With a Flow pipeline, you:

Get started

Ready to try out Flow? Sign up for free to get started! 🚀

Have questions? We'd love to hear from you:


Workflow Overview

Using Flow

Flow combines a low-code UI for essential workflows and a CLI for fine-grain control over your pipelines. Together, the two interfaces comprise Flow's unified platform. You can switch seamlessly between them as you build and refine your pipelines, and collaborate with a wider breadth of data stakeholders.

➡️ Sign up for a free Flow account here.

See the BSL license for information on using Flow outside the managed offering.

Resources

Support

The best (and fastest) way to get support from the Estuary team is to join the community on Slack.

You can also email us.

Connectors

Captures and materializations use connectors: plug-able components that integrate Flow with external data systems. Estuary's in-house connectors focus on high-scale technology systems and change data capture (think databases, pub-sub, and filestores).

Flow can run Airbyte community connectors using airbyte-to-flow, allowing us to support a greater variety of SaaS systems.

See our website for the full list of currently supported connectors.

If you don't see what you need, request it here.

How does it work?

Flow builds on a real-time streaming broker created by the same founding team called Gazette.

Because of this, Flow collections are both a batch dataset – they're stored as a structured "data lake" of general-purpose files in cloud storage – and a stream, able to commit new documents and forward them to readers within milliseconds. New use cases read directly from cloud storage for high-scale backfills of history, and seamlessly transition to low-latency streaming on reaching the present.

What makes Flow so fast?

Flow mixes a variety of architectural techniques to achieve great throughput without adding latency: