anyscale / airflow-provider-ray

Ray provider for Apache Airflow
Apache License 2.0

Fault tolerance with Ray + Airflow #25

Open richardliaw opened 3 years ago

richardliaw commented 3 years ago

Proposal here: https://docs.google.com/document/d/1wnlgQMzuu0vpF5jc4K6y2fMzICjuKKLCRXJL82CdE-Q/edit#

Specifically, we want to allow the user to selectively persist objects:

@task.ray(persist=True | False, storage_path="s3://...")

Args:
    persist (bool): Whether to persist the task's output to the provided
        storage path. A task can only be retried if its parent has persisted
        its outputs. If False, the output remains in the Ray object store.
    storage_path (str): Arbitrary URI.
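
To make the proposed semantics concrete, here is a minimal pure-Python sketch. It is not the provider's implementation and it does not use Ray or Airflow: `ray_task` is a hypothetical stand-in for `@task.ray`, `can_retry` is a hypothetical helper, and `storage_path` is assumed to be a local directory rather than an arbitrary URI such as `s3://...`. Pickling the output to `<task name>.pkl` is also an assumption made only for illustration.

```python
import pickle
import tempfile
from functools import wraps
from pathlib import Path


def ray_task(persist=False, storage_path=None):
    """Hypothetical stand-in for @task.ray (illustration only)."""
    if persist and storage_path is None:
        raise ValueError("persist=True requires a storage_path")

    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            result = fn(*args, **kwargs)
            if persist:
                # Assumption: storage_path is a local directory. A real
                # implementation would dispatch on the URI scheme instead.
                target = Path(storage_path) / f"{fn.__name__}.pkl"
                target.parent.mkdir(parents=True, exist_ok=True)
                target.write_bytes(pickle.dumps(result))
            return result

        # Record the setting so a scheduler can inspect it later.
        wrapper.persist = persist
        return wrapper

    return decorator


def can_retry(parents):
    """Return True only if every parent task persisted its output.

    Assumption: a task with no parents is treated as retryable.
    """
    return all(getattr(parent, "persist", False) for parent in parents)


# Usage: one persisted parent and one that stays in memory only.
with tempfile.TemporaryDirectory() as tmp_dir:

    @ray_task(persist=True, storage_path=tmp_dir)
    def load_data():
        return [1, 2, 3]

    @ray_task(persist=False)
    def scratch_data():
        return [0]

    load_data()
    downstream_of_load_retryable = can_retry([load_data])
    downstream_of_both_retryable = can_retry([load_data, scratch_data])
```

In this sketch the retry rule is checked per task from its parents' `persist` flags, which mirrors the constraint in the docstring above: a downstream task is only safe to retry when its inputs can be reloaded from storage instead of the object store.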