DAGWorks-Inc / hamilton

Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage and metadata. Runs and scales everywhere Python does.
https://hamilton.dagworks.io/en/latest/
BSD 3-Clause Clear License

Flytekit integration #73

Status: Closed (elijahbenizzy closed this 4 days ago)

elijahbenizzy commented 1 year ago

Is your feature request related to a problem? Please describe.

We should be able to create a Flyte workflow from Hamilton functions, i.e. recreate the following:

from flytekit import task, workflow

@task
def sum(x: int, y: int) -> int:
   return x + y

@task
def square(z: int) -> int:
   return z * z

@workflow
def my_workflow(x: int, y: int) -> int:
   return sum(x=square(z=x), y=square(z=y))

print(f"my_workflow output: {my_workflow(x=1, y=2)}")

in a Hamiltonesque way.

Describe the solution you'd like

We should be able to recreate the above by doing something like this:

# my_funcs.py
def square_x(x: int) -> int:
    return x * x

def square_y(y: int) -> int:
    return y * y

def sum_square_x_y(square_x: int, square_y: int) -> int:
    return square_x + square_y

# driver script (separate file)
from hamilton import driver
from hamilton.experimental import h_flyte  # proposed module, part of this request
import my_funcs

# Proposed adapter; constructor arguments are still to be decided.
fga = h_flyte.FlyteGraphAdapter(...)
dr = driver.Driver({}, my_funcs, adapter=fga)
result = dr.execute(["sum_square_x_y"], inputs={"x": 1, "y": 2})
print(result)
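To make the intended mapping concrete, here is a rough standard-library sketch of the resolution step such an adapter would sit on top of. Each function name becomes a node, and each parameter name points either to another node or to a user-supplied input. It uses neither Hamilton nor flytekit; `resolve` and `execute` are hypothetical helper names made up for illustration, and the point where `fn(**kwargs)` is called is where a Flyte task could presumably be created instead.

```python
import inspect
from typing import Any, Callable, Dict, List


def square_x(x: int) -> int:
    return x * x


def square_y(y: int) -> int:
    return y * y


def sum_square_x_y(square_x: int, square_y: int) -> int:
    return square_x + square_y


def resolve(
    name: str,
    funcs: Dict[str, Callable[..., Any]],
    inputs: Dict[str, Any],
    cache: Dict[str, Any],
) -> Any:
    """Compute node `name`, recursively resolving its parameters by name."""
    if name in inputs:  # user-supplied input
        return inputs[name]
    if name in cache:  # node already computed
        return cache[name]
    if name not in funcs:
        raise KeyError(f"No function or input named {name!r}")
    fn = funcs[name]
    # Parameter names are the upstream dependencies of this node.
    kwargs = {
        param: resolve(param, funcs, inputs, cache)
        for param in inspect.signature(fn).parameters
    }
    # Assumption: an adapter would wrap this call, e.g. in a Flyte task.
    cache[name] = fn(**kwargs)
    return cache[name]


def execute(
    functions: List[Callable[..., Any]],
    final_vars: List[str],
    inputs: Dict[str, Any],
) -> Dict[str, Any]:
    """Hypothetical stand-in for a driver: compute the requested outputs."""
    funcs = {fn.__name__: fn for fn in functions}
    cache: Dict[str, Any] = {}
    return {var: resolve(var, funcs, inputs, cache) for var in final_vars}


result = execute(
    [square_x, square_y, sum_square_x_y],
    ["sum_square_x_y"],
    inputs={"x": 1, "y": 2},
)
print(result)  # {'sum_square_x_y': 5}
```

With `x=1` and `y=2` this gives the same value as the Flyte workflow above (1 + 4 = 5), which is the behaviour the integration would need to preserve.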

Describe alternatives you've considered

TBD.

Additional context

Docs:

This feels very similar to https://www.prefect.io/ v2.0 -- so maybe whatever pattern we come up with here would also help provide integration there.

elijahbenizzy commented 1 year ago

Copy of https://github.com/stitchfix/hamilton/issues/139 -- automated migration script did not work for this one.