launchflow / buildflow

BuildFlow is an open-source framework for building large-scale systems using Python. You describe where your input comes from and where your output should be written, and BuildFlow handles the rest. No configuration outside of the code is required.
https://docs.launchflow.com/buildflow
Apache License 2.0
### **⚒️ Build your entire system in minutes using pure Python. ⚒️**

![CI](https://github.com/launchflow/buildflow/actions/workflows/python_ci.yaml/badge.svg) ![Release Tests](https://github.com/launchflow/buildflow/actions/workflows/release_tests.yaml/badge.svg) [![Python version](https://badge.fury.io/py/buildflow.svg)](https://pypi.org/project/buildflow) [![codecov](https://codecov.io/gh/launchflow/buildflow/branch/main/graph/badge.svg?token=AO0TP8XG7X)](https://codecov.io/gh/launchflow/buildflow) [![Discord](https://dcbadge.vercel.app/api/server/jRpkTAeEWx?style=flat)](https://discord.gg/jRpkTAeEWx)

📑 Resources

📖 [Docs](https://docs.launchflow.com/buildflow/introduction)   |   ⚡ [Quickstart](https://docs.launchflow.com/buildflow/quickstart)   |   👋 [Slack](https://join.slack.com/t/launchflowusers/shared_invite/zt-27wlowsza-Uiu~8hlCGkvPINjmMiaaMQ)   |   🌟 [Contribute](https://docs.launchflow.com/buildflow/developers/contribute)   |   🚀 [Deployment](https://www.launchflow.com/)  

🤔 What is BuildFlow?

BuildFlow is a Python framework that lets you build your entire backend system with a single framework. With our simple decorator pattern you can turn any function into a component of your backend system, so you can serve data over HTTP, dump data to a datastore, or process async data from message queues. All of these can use our built-in IO connectors, which let you create, manage, and connect to your cloud resources using pure Python.
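To make the decorator pattern concrete, here is a minimal sketch of how a decorator can register a plain function as a routed component. It uses only the standard library and is not BuildFlow's implementation: `MiniApp`, `endpoint`, and `dispatch` are hypothetical names invented for this illustration.

```python
from typing import Any, Callable, Dict, Tuple


class MiniApp:
    """Toy stand-in for an application object; not part of BuildFlow."""

    def __init__(self) -> None:
        # Maps (method, route) to the handler registered for it.
        self.routes: Dict[Tuple[str, str], Callable[..., Any]] = {}

    def endpoint(self, route: str, method: str = "GET") -> Callable:
        def register(fn: Callable[..., Any]) -> Callable[..., Any]:
            self.routes[(method, route)] = fn
            return fn  # the function itself is returned unchanged

        return register

    def dispatch(self, method: str, route: str, *args: Any) -> Any:
        # Look up the registered handler and call it.
        return self.routes[(method, route)](*args)


mini_app = MiniApp()


@mini_app.endpoint("/", method="GET")
def get() -> str:
    return "Hello World"


print(mini_app.dispatch("GET", "/"))
```

The decorated function stays an ordinary Python function, so it can still be called and unit tested directly.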

Key Features

Common Serving & Processing Patterns | 📖 Docs

Turn any function into a component of your backend system.

# Serve traffic over HTTP or Websockets
service = app.service("my-service")
@service.endpoint("/", method="GET")
def get():
    return "Hello World"

# Collect, transform, and write data to storage
@app.collector("/collect", method="POST", sink=SnowflakeTable(...))
def collect(request: Dict[str, Any]):
    return request

# Process data from message queues such as Pub/Sub & SQS
@app.consumer(source=SQSQueue(...), sink=BigQueryTable(...))
def process(element: Dict[str, Any]):
    return element

Infrastructure from Code | 📖 Docs

Create and connect to cloud resources using Python (powered by Pulumi).

# Use Python objects to define your infrastructure
sqs_queue = SQSQueue("queue-name")
gcs_bucket = GCSBucket("bucket-name")

# Your application manages its own infrastructure state
app.manage(sqs_queue, gcs_bucket)

# Use the same resource objects in your application logic
@app.consumer(source=sqs_queue, sink=gcs_bucket)
def process(event: YourSchema) -> OutputSchema:
    # Processing logic goes here
    return OutputSchema(...)
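`YourSchema` and `OutputSchema` in the snippet above are placeholders. As a minimal sketch, assuming schemas are written as plain Python dataclasses, they might look like the following. The field names are hypothetical and chosen only for illustration, and `process` is left undecorated so it runs without any cloud resources.

```python
from dataclasses import asdict, dataclass


@dataclass
class YourSchema:
    # Hypothetical input fields, chosen only for illustration.
    user_id: str
    amount_cents: int


@dataclass
class OutputSchema:
    # Hypothetical output fields, chosen only for illustration.
    user_id: str
    amount_dollars: float


def process(event: YourSchema) -> OutputSchema:
    # Pure transformation from the input schema to the output schema.
    return OutputSchema(
        user_id=event.user_id,
        amount_dollars=event.amount_cents / 100,
    )


result = process(YourSchema(user_id="u-1", amount_cents=1250))
print(asdict(result))
```

Keeping the processing logic a pure function of typed inputs means it can be tested on its own before a source and sink are attached.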

Dependency Injection | 📖 Docs

Inject any dependency with full control over its setup and lifecycle.

# Define custom dependencies
@dependency(Scope.GLOBAL)
class MyStringDependency:
    def __init__(self):
        self.my_string = "HELLO!"

# Or use the prebuilt dependencies
PostgresDep = SessionDepBuilder(postgres)

# BuildFlow handles the rest
@service.endpoint("/hello", method="GET")
def hello(db: PostgresDep, custom_dep: MyStringDependency):
    with db.session as session:
        user = session.query(User).first()
    # Returns "HELLO! User.name"
    return f"{custom_dep.my_string} {user.name}"
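The idea behind annotation-driven injection with a global scope can be sketched in a few lines of standard-library Python. This is a toy illustration, not how BuildFlow implements `@dependency`: `GlobalContainer`, `resolve`, and `inject` are hypothetical names, and the sketch only handles dependencies whose constructors take no arguments.

```python
from typing import Any, Callable, Dict, get_type_hints


class GlobalContainer:
    """Toy container: one shared instance per dependency class."""

    def __init__(self) -> None:
        self._instances: Dict[type, Any] = {}

    def resolve(self, cls: type) -> Any:
        # Build the dependency on first use, then reuse it (global scope).
        if cls not in self._instances:
            self._instances[cls] = cls()
        return self._instances[cls]

    def inject(self, fn: Callable[..., Any]) -> Callable[..., Any]:
        # Read parameter annotations once, ignoring the return annotation.
        hints = get_type_hints(fn)
        hints.pop("return", None)

        def wrapper(**overrides: Any) -> Any:
            # Resolve every annotated parameter the caller did not supply.
            kwargs = {
                name: self.resolve(cls)
                for name, cls in hints.items()
                if name not in overrides
            }
            kwargs.update(overrides)
            return fn(**kwargs)

        return wrapper


class MyStringDependency:
    def __init__(self) -> None:
        self.my_string = "HELLO!"


container = GlobalContainer()


@container.inject
def hello(custom_dep: MyStringDependency) -> str:
    return custom_dep.my_string


print(hello())
```

Because the wrapper accepts keyword overrides, a test can pass a fake dependency explicitly instead of the shared instance.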

Async Runtime | 📖 Docs

Scale out parallel tasks across your cluster with Ray or any other async framework.

@ray.remote
def long_task(elem):
    time.sleep(10)
    return elem

@app.consumer(PubSubSubscription(...), BigQueryTable(...))
async def my_consumer(elem):
    # Tasks are automatically parallelized across your cluster
    return await long_task.remote(elem)
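The same "await a parallel task" shape can be tried without Ray or a cluster. The sketch below uses only the standard library `asyncio` (it assumes Python 3.9+ for `asyncio.to_thread`) and runs blocking work in worker threads on one machine. It illustrates the async pattern only and is not a substitute for Ray's distributed scheduling.

```python
import asyncio
import time
from typing import List


def long_task(elem: int) -> int:
    # Blocking work standing in for a slow task; doubles the input
    # so the processed result is easy to recognize.
    time.sleep(0.1)
    return elem * 2


async def my_consumer(elems: List[int]) -> List[int]:
    # Run each blocking call in a worker thread and await them together.
    return await asyncio.gather(
        *(asyncio.to_thread(long_task, elem) for elem in elems)
    )


results = asyncio.run(my_consumer([1, 2, 3]))
print(results)
```

`asyncio.gather` returns results in the order the tasks were passed in, regardless of which one finishes first.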

⚙️ Installation

pip install buildflow

Extra Dependencies

Pulumi Installation

BuildFlow uses Pulumi to manage resources used by your application. To install Pulumi visit: https://www.pulumi.com/docs/install/

Installing Pulumi unlocks:

🩺 Code Health Checks

We use black and ruff with pre-commit hooks to perform health checks. To set these up locally:

📜 License

BuildFlow is open source and licensed under the Apache License 2.0. We welcome contributions; see our CONTRIBUTING.md file for details.