flyteorg / flyte

Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.
https://flyte.org
Apache License 2.0

[Core feature] Plugin loading failures #2991

Open hamersaw opened 1 year ago

hamersaw commented 1 year ago

Motivation: Why do you think this is important?

Currently, failures incurred during plugin setup cascade up through FlytePropeller startup and fail the entire application. This means a single misconfigured plugin will stop a FlytePropeller Pod from deploying successfully.

Goal: What should the final outcome look like, ideally?

If a plugin fails to load, it should be marked as 'disabled' (with error logs) instead of failing FlytePropeller startup. Any task that requires this plugin should report that the plugin was disabled because it failed to load, so that the reason is visible in the UI.
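A minimal sketch of that startup behavior, to make the goal concrete. Everything here is hypothetical and for illustration only: `Plugin`, `Loader`, `LoadAll`, and `exampleLoaders` are not FlytePropeller types or functions, and the plugin names are just sample keys. The point is that a load error is logged and recorded rather than returned to the caller.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// Plugin is a hypothetical stand-in for a task plugin (not a Flyte type).
type Plugin interface {
	Name() string
}

// Loader is a hypothetical constructor that may fail, e.g. on bad configuration.
type Loader func() (Plugin, error)

type namedPlugin struct{ name string }

func (p namedPlugin) Name() string { return p.name }

// LoadAll runs every loader. A failure is logged and recorded under the
// plugin's name instead of being returned, so one bad plugin cannot abort startup.
func LoadAll(loaders map[string]Loader) (enabled map[string]Plugin, disabled map[string]error) {
	enabled = make(map[string]Plugin)
	disabled = make(map[string]error)
	for name, load := range loaders {
		p, err := load()
		if err != nil {
			log.Printf("plugin %q disabled: failed to load: %v", name, err)
			disabled[name] = err
			continue
		}
		enabled[name] = p
	}
	return enabled, disabled
}

// exampleLoaders returns one loader that succeeds and one that fails.
func exampleLoaders() map[string]Loader {
	return map[string]Loader{
		"container": func() (Plugin, error) { return namedPlugin{"container"}, nil },
		"spark":     func() (Plugin, error) { return nil, errors.New("missing configuration") },
	}
}

func main() {
	enabled, disabled := LoadAll(exampleLoaders())
	fmt.Printf("enabled=%d disabled=%d\n", len(enabled), len(disabled))
}
```

Startup continues with whatever loaded, and the `disabled` map keeps the original error so it can be surfaced later.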

Describe alternatives you've considered

If, rather than labeling a plugin as "disabled" on startup failure, we simply do not load it, then tasks requiring the plugin will fall back to the default plugin (i.e. "container"). So we must add an additional structure that tracks "disabled" plugins.
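To illustrate why the extra tracking structure matters, here is a rough sketch of plugin resolution. The `registry` type, its fields, `Resolve`, and `exampleRegistry` are assumptions made up for this example, not FlytePropeller's actual code, and keying both maps by task type is a simplification. Without the `disabled` check, a task whose plugin failed to load would silently resolve to the default plugin.

```go
package main

import (
	"errors"
	"fmt"
)

// defaultPlugin mirrors the "container" default mentioned above.
const defaultPlugin = "container"

// registry is a hypothetical structure, not FlytePropeller's actual one.
type registry struct {
	enabled  map[string]bool  // task types whose plugin loaded successfully
	disabled map[string]error // task types whose plugin failed on load
}

// Resolve returns the name of the plugin that should handle taskType.
// Disabled plugins are checked first so they never fall through to the default.
func (r registry) Resolve(taskType string) (string, error) {
	if err, ok := r.disabled[taskType]; ok {
		return "", fmt.Errorf("plugin for task type %q is disabled because it failed to load: %w", taskType, err)
	}
	if r.enabled[taskType] {
		return taskType, nil
	}
	// No plugin was ever registered for this task type: use the default.
	return defaultPlugin, nil
}

// exampleRegistry has one loaded plugin and one that failed on load.
func exampleRegistry() registry {
	return registry{
		enabled:  map[string]bool{"sidecar": true},
		disabled: map[string]error{"spark": errors.New("missing configuration")},
	}
}

func main() {
	r := exampleRegistry()
	for _, taskType := range []string{"sidecar", "spark", "unknown-task"} {
		plugin, err := r.Resolve(taskType)
		fmt.Printf("%s -> plugin=%q err=%v\n", taskType, plugin, err)
	}
}
```

The error returned for a disabled plugin wraps the original load error, which is the kind of message that could be attached to the task so it shows up in the UI.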

Propose: Link/Inline OR Additional context

Here is where plugin loading errors are handled.


hamersaw commented 1 year ago

cc @honnix does this correctly summarize the plugin loading issue?

honnix commented 1 year ago

Thank you for putting this together. Looks great!

github-actions[bot] commented 11 months ago

Hello 👋, This issue has been inactive for over 9 months. To help maintain a clean and focused backlog, we'll be marking this issue as stale and will close the issue if we detect no activity in the next 7 days. Thank you for your contribution and understanding! 🙏

github-actions[bot] commented 11 months ago

Hello 👋, This issue has been inactive for over 9 months and hasn't received any updates since it was marked as stale. We'll be closing this issue for now, but if you believe this issue is still relevant, please feel free to reopen it. Thank you for your contribution and understanding! 🙏

github-actions[bot] commented 2 months ago

Hello 👋, this issue has been inactive for over 9 months. To help maintain a clean and focused backlog, we'll be marking this issue as stale and will engage on it to decide if it is still applicable. Thank you for your contribution and understanding! 🙏