kubeflow / community

Information about the Kubeflow community including proposals and governance information.
https://www.kubeflow.org/docs/about/community/

Proposal: ML Experience Working Group #808

Open ederign opened 1 month ago

ederign commented 1 month ago

Hi everyone! 👋

After brainstorming with some community members about how to improve the Kubeflow user and developer experience for data scientists and ML practitioners, I decided to go one step further and start a formal discussion by proposing a new IDE working group and its initial roadmap.

The IDE Working Group (potentially, Kubeflow Jupyter Extension WG) will be responsible for developing and integrating IDE-based tools and extensions to provide a streamlined user experience to data scientists and machine learning practitioners on Kubeflow.

WG IDE Charter

The IDE Working Group is responsible for developing and integrating IDE-based tools and extensions to provide a streamlined user experience to data scientists and machine learning practitioners on Kubeflow.

This charter adheres to the conventions, roles, and organization management outlined in wg-governance.

Scope

The IDE Working Group focuses on developing, maintaining, and improving tools and extensions that support data science and machine learning practitioners' workflows within Kubeflow. The group is dedicated to delivering a high-level, seamless experience integrated with the IDE of choice across multiple Kubeflow components.

In scope

Code, Binaries, and Services

  1. Development of Kubeflow JupyterLab extensions that provide simple abstractions and UX to interact with the most common Kubeflow components (e.g., pipelines, hyperparameter tuning) and shorten the time to value for practitioners comfortable with Jupyter. These extensions will focus on the most used Kubeflow components, such as:

    • Pipelines;
    • Training Operator & Katib;
    • Model Registry;
    • Model Serving (KServe);
    • Feast
  2. Promote the reusability of UI components from other Kubeflow UIs into the IDE (e.g., rendering a pipeline graph inside the JupyterLab environment) by establishing a shared contract between the IDE WG and the wider Kubeflow community. 

  3. Develop a Python SDK to simplify operationalization across Kubeflow components and provide a "one-stop-shop" for practitioners who want easy access to Kubeflow services. The SDK also provides the groundwork for the IDE extension automation and workflows.

    • Create a single installation and configuration layer for users interacting programmatically with the Kubeflow ecosystem via SDKs.
    • The "common" SDK is not meant to replace individual components' SDKs but rather to offer a unified access layer to simplify dependency management and shared configuration (like authorization).
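A minimal sketch of how the unified access layer in item 3 could be shaped, assuming a single shared configuration object that is handed to each component's own client. Every name here (`SharedConfig`, `UnifiedSession`, `StubPipelinesClient`) is hypothetical and for illustration only; none of them is an existing Kubeflow API, and the real component SDKs are replaced by a stub.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional


@dataclass(frozen=True)
class SharedConfig:
    """Hypothetical settings shared by every component client."""
    host: str
    namespace: str = "default"
    token: Optional[str] = None

    def auth_headers(self) -> Dict[str, str]:
        # Authorization is configured once and reused by all clients.
        return {"Authorization": f"Bearer {self.token}"} if self.token else {}


# A factory builds a component client from the shared configuration.
ClientFactory = Callable[[SharedConfig], Any]


class UnifiedSession:
    """Hypothetical access layer: one config, many component clients.

    It does not replace component SDKs; it only constructs them lazily
    from the same configuration and caches the result.
    """

    def __init__(self, config: SharedConfig) -> None:
        self.config = config
        self._factories: Dict[str, ClientFactory] = {}
        self._clients: Dict[str, Any] = {}

    def register(self, name: str, factory: ClientFactory) -> None:
        self._factories[name] = factory

    def client(self, name: str) -> Any:
        if name not in self._clients:
            if name not in self._factories:
                raise KeyError(f"no client registered for component {name!r}")
            self._clients[name] = self._factories[name](self.config)
        return self._clients[name]


@dataclass
class StubPipelinesClient:
    """Stand-in for a component SDK client; not a real Kubeflow class."""
    config: SharedConfig


if __name__ == "__main__":
    session = UnifiedSession(
        SharedConfig(host="https://kubeflow.example.com", namespace="team-a", token="abc")
    )
    session.register("pipelines", StubPipelinesClient)
    pipelines = session.client("pipelines")
    print(pipelines.config.namespace, pipelines.config.auth_headers())
```

The registry keeps the common layer thin: each component keeps its own SDK, and the shared layer only owns installation-time wiring and configuration such as authorization.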

Guiding Principles

Cross-cutting and Externally Facing Processes

Out of scope


Working Group Roadmap Proposal

Vision

Development of Kubeflow JupyterLab extensions that provide simple abstractions and UX to interact with the most common Kubeflow components (e.g., pipelines, hyperparameter tuning) and shorten the time to value for practitioners comfortable with Jupyter. These extensions will focus on the most used Kubeflow components, such as Pipelines, Training Operator & Katib, Model Registry, Model Serving (KServe), Feast, etc.

Phase 1 - Establish baseline (XX Months)

Goal: Baseline/starting point for Kubeflow IDE Extension

This phase will consist of three main tasks:

Task breakdown:

Kale: Note: @StefanoFioravanzo started this issue https://github.com/kubeflow/community/issues/730 and got great feedback and traction from the community.

Elyra: Note: This work is already in progress by my group at Red Hat, together with the Elyra community.

Jupyter Scheduler

Phase 2 - Code Migration (XX Months)

Goal: code consolidated within the Kubeflow GitHub organization with proper code structure and naming

Phase 1 focused on establishing a baseline by demoing Kale and Elyra integrations successfully. In this phase we want to consolidate the Kale codebase under the Kubeflow organization. This new structure will allow us to work on top of Kale and iteratively build the new IDE experience for Kubeflow. Elyra will continue to be the interim solution for low-code visual pipeline authoring.

Phase 3 - Enhance IDE extension (XX Months)

Goal: Add visual authoring and runtime pipeline visualization to the Kale baseline. With these new features, Kubeflow can provide both a notebook-based and a visual, drag-and-drop pipeline authoring experience. We are also planning to provide the same visualization look and feel in both the IDE and the Kubeflow Central Dashboard.

Long-term plan

Goal: Kubeflow JupyterLab Extension MVP will provide a streamlined user experience to data scientists and machine learning practitioners across all components of the Kubeflow ecosystem.

ederign commented 1 month ago

CC @kubeflow/kubeflow-steering-committee @StefanoFioravanzo @andreyvelich

This proposal submission is a collaboration between @StefanoFioravanzo, @andreyvelich, and myself. We also got helpful feedback from multiple other community members.

ederign commented 1 month ago

This proposal is also related to the 'SDK discussion' on https://github.com/kubeflow/training-operator/issues/2402#issuecomment-2619160006

StefanoFioravanzo commented 1 month ago

@ederign thanks for migrating our notes and creating the issue! Looking forward to starting these efforts and can't wait to hear feedback from the community

andreyvelich commented 1 month ago

cc @zsailer @bigsur0 @shravan-achar @akshaychitneni

lresende commented 1 month ago

Thanks for the well-written proposal. Some of these align very well with the mission of the Elyra project. Given the synergy, it might be a good idea to explore how we could pursue some of these in the context of Jupyter/Elyra, particularly since these are all projects related to the Linux Foundation. Please let me know if any specific meetings are happening in this area.

cc @caponetto @shalberd @romeokienzler

ederign commented 1 month ago

@lresende absolutely! We still need to wait for broader feedback from the community about the proposal, but if we agree to proceed, I'll make sure to invite Elyra folks to the discussions.

Griffin-Sullivan commented 1 month ago

I think this is a great idea and will enhance the overall UX with Kubeflow! I'd be happy to help out with any of the initiatives.

milosjava commented 1 month ago

Really detailed proposal, thank you very much for that! From my experience at Pepsico, Data Scientists often struggle to get familiar with Kubeflow, and companies typically need to develop a tool or library to help them use it effectively. Once implemented, this could definitely accelerate adoption.

tarekabouzeid commented 1 month ago

I think it's a really great initiative that will improve Kubeflow usability. And thank you so much for the detailed explanation, great work! I would really like to help with this initiative.

andreyvelich commented 1 month ago

Hi folks, I propose a new name for this Working Group: ML Experience, given that we will develop many tools (Jupyter extensions, SDK, reusable UI components) that streamline the ML engineer experience. What does the community think about this?

StefanoFioravanzo commented 1 month ago

@andreyvelich before focusing on the name itself - do you confirm you are ok with the charter and the proposed action plan? Don't want to get hung up on naming in case there are aspects of the proposal that need to be discussed.

If the proposal looks ok, then let's discuss naming.

andreyvelich commented 1 month ago

Sure, that sounds good to me @StefanoFioravanzo! In any case, let's talk about it at the next Kubeflow Community Call and convert this proposal to a PR in kubeflow/community.

ederign commented 1 month ago

Thanks, everyone, for all the input here. I just submitted a proposal to the Kubeflow community: https://github.com/kubeflow/community/pull/824

varodrig commented 1 month ago

@ederign thank you for working on this proposal. I love the idea of a user-centric approach, especially looking into how the different tools can make the user journey easier, whether by integrating existing tools or building new ones. I'm interested in joining.

ederign commented 4 weeks ago

@varodrig great! I would love your feedback at https://github.com/kubeflow/community/pull/824

RonakSingh55 commented 2 weeks ago

@ederign, could you provide some initial guidance or key resources to help me gain a better understanding of the project?

szaher commented 2 weeks ago

@ederign I would like to join the WG, if possible, please.

ederign commented 1 week ago

@RonakSingh55 @szaher, that is great! We are discussing the official proposal of the working group here: https://github.com/kubeflow/community/pull/824

andreyvelich commented 1 week ago

Let's keep it open until we finalize the scope of the ML Experience WG.

/retitle Proposal: ML Experience Working Group

ederign commented 1 week ago

I just raised a new PR with a follow-up of the requested changes on https://github.com/kubeflow/community/pull/824.

Abhsihekkaul commented 3 days ago

Hi @ederign, @StefanoFioravanzo

I came across the opportunity to develop a JupyterLab Plugin for Kubeflow, and I’m highly interested in contributing to this project. With my experience in JavaScript, React, Python, and API integrations, I believe I can help create a seamless JupyterLab extension that integrates with Kubeflow Pipelines, Notebooks, Model Registry, and Training Operator.

I have experience in JupyterLab extensions, backend API development, and have worked on projects involving data processing, AI tools, and web applications. I am eager to modernize and consolidate existing solutions like Elyra, Kale, and Jupyter Scheduler into a unified plugin to enhance the Kubeflow ecosystem.

I would love to discuss how I can contribute effectively. Please let me know the next steps or if there’s any documentation I should review to get started.

Looking forward to your response!

Best regards, Abhishek kaul

ederign commented 2 days ago

Hi @Abhsihekkaul! That is great, and I'm looking forward to collaborating with you! We are in the process of setting up a place for us to start gathering! As soon as I have the Slack channel, I'll let everybody here know!

Abhsihekkaul commented 2 days ago

Ok @ederign

In the meantime, shall I create a prototype of the implementation, draft my GSoC proposal, and get a review from the team?

sudhathorat31 commented 1 day ago

Hi @ederign, @StefanoFioravanzo

I’m interested in contributing to the JupyterLab Plugin for Kubeflow and have started drafting my proposal for GSoC 2025 on this project.

Currently, I am pursuing my Master’s in Computer Science and am skilled in JavaScript, React, Python, and API integrations, with a strong focus on building scalable applications and intuitive user experiences.

What I have done to learn about Kubeflow:

Work in progress:

Looking forward to contributing to this project!