flyteorg / flyte

Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.
https://flyte.org
Apache License 2.0
5.63k stars 622 forks source link

[Core feature] Allow passing additional metadata in inputs/outputs of tasks/workflows #4654

Open EngHabu opened 9 months ago

EngHabu commented 9 months ago

Motivation: Why do you think this is important?

There are a few instances we faced in the past where we wanted to pass additional contextual information about the inputs/outputs of a task that are not themselves inputs/outputs (The most recent example is flyte decks). Unfortunately the way we had to pass that is by convention. The engine (i.e. Propeller) assumes a task writes decks, checks for the file existence and bails if it's not there. Conventions are not very strong :-) and limit the optimizations the system can make. Another instance of this is executing, generating and exporting input validation report from Great Expectations or Pandera. This is not really an output but rather metadata about the input/output dataframes.

Goal: What should the final outcome look like, ideally?

This is a complex change and its complexity stems from a few aspects:

  1. The system assumes Inputs and outputs are always LiteralMaps... which is an assumption we are breaking
  2. Backward compatibility, at any point in time users expect to use an old flytekit with a new backend and maybe an even older frontend and some 4th version of flyte propeller. Making the system backward and forward compatible is crucial for a smooth rollout.

Here is a proposed breakdown of the work particularly around the component boundaries:

The rollout plan is:

  1. Update flyteadmin to latest
  2. Update flyteconsole, copilot, propeller, flytekit to latest in any order

As long as flyteadmin is updated first, the system assumptions will hold.

Describe alternatives you've considered

Continue to use conventions to pass data... not good

Propose: Link/Inline OR Additional context

No response

Are you sure this issue hasn't been raised already?

Have you read the Code of Conduct?

dosubot[bot] commented 9 months ago

šŸ¤–

Hello @EngHabu!

I'm Dosu, a friendly bot here to assist you while we wait for a human maintainer to chime in. I'm here to help you solve bugs, answer questions, and navigate your way into becoming a contributor. Just remember, I'm a bot, but I'm here to make your interaction with this repository as smooth as possible.

Let's get cracking on this!

Thank you for your detailed request. I agree that passing additional metadata in inputs/outputs of tasks/workflows can be a valuable feature for Flyte.

As you've mentioned, this is a complex change due to the system's assumption that Inputs and Outputs are always LiteralMaps and the need for backward compatibility.

To implement this feature, we can extend the existing protobuf definitions to include additional fields for metadata. For example, in the TaskMetadata struct, we could add a new field for metadata:

type TaskMetadata struct {
    // existing fields...
    Metadata map[string]string `protobuf:"bytes,11,rep,name=metadata,proto3" json:"metadata,omitempty" protobuf_key:"bytes,1,opt,name=key,proto3" protobuf_val:"bytes,2,opt,name=value,proto3"`
}

This would allow us to pass arbitrary key-value pairs as metadata. The same can be done for workflows.

To ensure backward compatibility, we would need to make sure that any code that reads these protobuf messages can handle the case where the new Metadata field is not present. This can be done by checking if the field is nil before trying to access it:

func (m *TaskMetadata) GetMetadata() map[string]string {
    if m != nil {
        return m.Metadata
    }
    return nil
}

For a smooth rollout, we could first deploy a version of your code that can handle the new Metadata field but does not yet produce it. Once all systems are running the new code, we can start producing messages with the new field.

Please note that this is a high-level overview of the changes that would be needed. The actual implementation might require additional changes depending on the specifics of your system.

I hope this helps! If you have any further questions or need clarification on any points, please don't hesitate to ask.

Sources


This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a šŸ‘ if this is helpful and šŸ‘Ž if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.