flyteorg / flyte

Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.
https://flyte.org
Apache License 2.0

[Plugin] Numpy dtypes integration #2753

Open cosmicBboy opened 2 years ago

cosmicBboy commented 2 years ago

Currently, the numpy type extension supports only numpy arrays. It would be useful to support numpy scalar data types as well.

So today, this code will not work:

import numpy as np
from flytekit import task, workflow

@task
def my_task() -> float:
    x = np.array([1,2,3])
    return x.mean()

@workflow
def wf():
    my_task()

wf()

Error:

TypeError: Failed to convert return value for var o0 for function my_task with error <class 'flytekit.core.type_engine.TypeTransformerFailedError'>: Expected value of type <class 'float'> but got type <class 'numpy.float64'>

Problem: flytekit doesn't know how to handle numpy.<dtype> scalar types. This pattern shows up anywhere a user aggregates values with numpy operations (including through sklearn, and probably a bunch of other libraries).
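To illustrate why the value is rejected, here is a minimal sketch of how numpy scalars relate to Python builtins. The error message suggests the return value is compared against the declared type more strictly than `isinstance` would; that reading is an assumption about flytekit internals, not something confirmed in this thread.

```python
import numpy as np

mean = np.array([1, 2, 3]).mean()

# np.float64 subclasses Python's float, yet it is a distinct type, so an
# exact type comparison rejects it while isinstance accepts it.
is_exact_float = type(mean) is float
is_float_instance = isinstance(mean, float)

# Integer scalars such as np.int64 do not subclass Python's int at all.
count = np.int64(3)
is_int_instance = isinstance(count, int)

print(type(mean).__name__, is_exact_float, is_float_instance, is_int_instance)
```

So even a relaxed `isinstance` check would only help for `float`; integer and boolean numpy scalars would still need an explicit conversion.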

Of course, the user can always call float(x.mean()) to convert it to a normal float, but this kills UX because it forces users to rewrite otherwise functioning code into flytekit-nitpicky syntax.

Potential Solutions

  1. Create a set of type transformers for numpy scalar dtypes
  2. Modify existing primitive type transformers (see here) to recognize numpy types.
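Either option needs some way to turn a numpy scalar into the matching Python builtin. The following is a rough sketch of that conversion step only, assuming numpy's `.item()` is an acceptable conversion. `coerce_numpy_scalar` is a hypothetical helper name, not part of flytekit, and how it would be wired into a type transformer is left open.

```python
import numpy as np


def coerce_numpy_scalar(value, expected_type):
    """Hypothetical helper (not a flytekit API).

    Converts a numpy scalar to the Python builtin `expected_type`, and
    passes any non-numpy value through unchanged.
    """
    if isinstance(value, np.generic):
        # .item() returns the closest native Python scalar,
        # e.g. np.float64 -> float, np.int64 -> int, np.bool_ -> bool.
        native = value.item()
        if type(native) is expected_type:
            return native
        raise TypeError(
            f"Cannot convert {type(value).__name__} to {expected_type.__name__}"
        )
    return value


result = coerce_numpy_scalar(np.array([1, 2, 3]).mean(), float)
print(type(result).__name__, result)
```

The sketch deliberately refuses mismatched conversions (for example `np.int64` to `float`) rather than widening silently; whether flytekit should be that strict is a design choice for the eventual implementation.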


SmritiSatyanV commented 2 years ago

Could I work on this? @cosmicBboy

aryamans29002 commented 2 years ago

Can I work on this, @cosmicBboy?

techytushar commented 2 years ago

Hey @cosmicBboy, I have created a PR to fix this (https://github.com/flyteorg/flytekit/pull/1219). Could you please check whether this is the right way to solve it, so I can go ahead and add the tests as well?

P3rcy-8685 commented 2 years ago

I would like to work on this @cosmicBboy

samhita-alla commented 2 years ago

Since @techytushar created a PR already, I assigned the issue to Tushar.

github-actions[bot] commented 6 months ago

Hello 👋, this issue has been inactive for over 9 months. To help maintain a clean and focused backlog, we'll be marking this issue as stale and will engage on it to decide if it is still applicable. Thank you for your contribution and understanding! 🙏

Monis-Ahmed-Rizvi commented 2 weeks ago

I'd like to take on this issue if no one else is working on it.