cncf / toc

⚖️ The CNCF Technical Oversight Committee (TOC) is the technical governing body of the CNCF Foundation.
https://cncf.io

[Proposal] TAG Artificial Intelligence #1048

Closed AlexsJones closed 7 months ago

AlexsJones commented 1 year ago

Proposal TAG Artificial Intelligence

Dear TOC, we bring to your attention this proposal for a new TAG for the consideration of your group.

Prior to making this request, we researched the suitability of existing TAGs and working groups. However, it became increasingly apparent that the breadth, significance, and multi-faceted nature of Artificial Intelligence could potentially (pending your review) warrant a new TAG.

Thank you for your consideration.

Introduction

Igniting the mainstream consciousness with the availability of general-purpose large language models, Artificial Intelligence (AI) is set to change the world in ways we are yet to understand fully. Conversational AI might be the first interaction that many end-users experience, yet many real-world applications go far beyond simple interactions.

Cloud-native technology is a domain that is particularly susceptible to the near-term inclusion of AI, given its highly interoperable ecosystem and extensive open-source communities.

It is incumbent upon the community to build functions to safely appraise and advise on this advancement in technology.

Mission Statement

This TAG aims to advocate for, develop, support, and help evaluate Artificial Intelligence initiatives in cloud native technologies. This TAG will acknowledge the complex ethical, legal and social issues that AI creates and look to build rich community knowledge that helps empower informed growth and development.

Responsibilities & Deliverables

Background

We recognize that:

In-Scope

Out of scope

Deliverables to TOC

Audiences

Operations

Artificial Intelligence TAG operations are consistent with the standard TAG operating guidelines provided by the CNCF Technical Oversight Committee (TOC).

hannibalhuang commented 1 year ago

big +1 to the initiative, as we also talked about back at KubeCon to launch an effort to build open source cloud native LLM infra

TheFoxAtWork commented 1 year ago

@AlexsJones thank you so much for initiating this proposal! There has been a lot of discussion around AI/ML/LLM as of late. We're currently documenting/refining the process for establishing and onboarding new TAGs. Given the discussion there, I would recommend bringing this proposal to TAG Runtime first as a WG. FYI @nikhita @RichiH @cathyhongzhang as TOC Liaisons. CC @raravena80 , @quinton-hoole as TAG Chairs.

mrbobbytables commented 1 year ago

I can get behind a TAG to help oversee this space with a caveat.

LLMs and conversational AI are bringing a lot of attention to the space, but I would encourage folks not to focus on it. The field is broader, and there's been a gradual growth in Cloud Native AI/ML, HPC & Data Processing for several years now. They are intrinsically linked with each other, but many focus too far up the stack.

There are several projects and groups that would already fall under this expanded umbrella:

There are also a slew of projects outside the CNCF that fall into this space, flux-framework, YuniKorn etc.

The research user group also has a good idea on gaps and where further investment is needed in the space and I would encourage a strong collaboration with them.

xmulligan commented 1 year ago

This seems to be slightly at odds with the suggested update to TAG Runtime https://github.com/cncf/toc/pull/1049/files

RichiH commented 1 year ago

FYI, CNCF Governing Board's Legal Committee is also currently discussing this overall space.

The TOC can sync back any design docs, suggestions, etc., but we likely need to sync more widely than usual before making an active delegation for this.

AlexsJones commented 1 year ago

Thank you for all the comments, especially those from @TheFoxAtWork @mrbobbytables and @RichiH for giving some additional context.

I wholeheartedly concur that this is an expansive area of research with numerous organizations exploring various aspects. Building upon Bob's comment, it's crucial to dispel the misconception that AI is synonymous with LLM. On the contrary, the recent litmus test of mass "GPT" adoption has demonstrated that the level of technological advancement is aligning with the user's ability to utilize it, which indicates exciting developments on the horizon within our interoperable, cloud-native ecosystem.

In truth, my motivation is less about where this research lives within the CNCF but more to ensure we have a safe harbour where we can work with the community to build a strong ethical framework around the implications of the adoption and usage of Artificial Intelligence.

I agree with Emily and will proactively collaborate with the TAG Runtime leadership to understand the WG's role beyond the discussion of LLM infrastructure architecture. Moreover, I have the privilege of knowing several colleagues from the academic research community in London and Finland who would be thrilled to be introduced to the cloud-native ecosystem. They possess a wealth of knowledge and expertise in the training of synthetic behaviour LLMs and task-oriented AI, and I believe their contributions would be invaluable.

Aisuko commented 1 year ago

Hi, guys. I'd like to share my opinions.

Here I agree with @AlexsJones. In my personal experience over the past several months, I have seen many startup projects built on LLMs, using different programming languages and different LLMs.

Many of these projects can run on customer hardware and are not limited to a single LLM (they can combine many LLMs together), but I have not heard of any organization announcing something like the "Three Laws of Robotics".

So I believe CNCF can have some rules or frameworks to guide these AI-related projects and make sure these startup AI projects get a great incubation environment. As an example, as a maintainer in the service mesh community, I saw CNCF/smi help define "A standard interface for service meshes on Kubernetes", which is an invaluable contribution to the whole domain.

Thanks

caniszczyk commented 1 year ago

"In truth, my motivation is less about where this research lives within the CNCF but more to ensure we have a safe harbour where we can work with the community to build a strong ethical framework around the implications of the adoption and usage of Artificial Intelligence."

+100 and note that the LF and CNCF are happy to support this in another way. There are also some sister foundations in the LF like PyTorch Foundation and https://landscape.lfai.foundation that are working in this area as an FYI

AlexsJones commented 1 year ago

"In truth, my motivation is less about where this research lives within the CNCF but more to ensure we have a safe harbour where we can work with the community to build a strong ethical framework around the implications of the adoption and usage of Artificial Intelligence."

+100 and note that the LF and CNCF is happy to support this in another way. There are also some sister foundations in the LF like PyTorch Foundation and https://landscape.lfai.foundation that are working in this area as an FYI

Thank you for sharing this; truly there is work being done within LF AI & Data that is exemplary.

I perceive academic research and work thriving in other foundations as guiding lights, illuminating the path for end-users in the rapidly evolving AI landscape. Within this context, an exceptional opportunity presents itself. While other entities conduct research, provide advice, and guide projects, we find ourselves in a distinctive position to offer the cloud-native community a glimpse into that realm.

We can establish an advisory group equipped with deep expertise tailored to the needs, inquiries, and experiments of our vendors, communities, and members. And if we end up inspiring cross-pollination between foundations, then I see that as an immense triumph of community collaboration.

raravena80 commented 1 year ago

"In truth, my motivation is less about where this research lives within the CNCF but more to ensure we have a safe harbour where we can work with the community to build a strong ethical framework around the implications of the adoption and usage of Artificial Intelligence."

Aside from the legal and ethical aspects, in tech terms (to me) this appears to be about projects related to ML/AI diagnostics/explainability/responsibility/observability. I'm familiar with the space, but as "cloud native" stands now I don't see it tying directly to "cloud native infrastructure and apps" (other than that these types of solutions could run on top of something like K8s; sometimes they are only run on Jupyter or data science notebooks).

Wrt "cloud native", besides running NLP/LLM workloads, I think there is an interesting space where you can ask/answer questions about cloud native environments (Note: answers could be inaccurate sometimes 😄 )

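For example, questions like:

  • What is the health of my K8s cluster?
  • Tell me how I can improve the utilization of my servers or storage?
  • What gaps are there in my cloud native environment?
  • What parts of my infrastructure have not been used in the last three months?
  • Are there any bottlenecks in my application?
  • Etc, etc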

Imo, these fall within the scope of TAG-Observability.

In any case, I think there are plenty of opportunities to collaborate.

mudler commented 1 year ago

For example, questions like:

* What is the health of my K8s cluster?

* Tell me how I can improve the utilization of my servers or storage?

* What gaps are there in my cloud native environment?

* What parts of my infrastructure have not been used in the last three months?

* Are there any bottlenecks in my application?

* Etc, etc

Imo, these fall within the scope of TAG-Observability.

Apologies for jumping in. I partly agree, because it really depends on the projects that will show up in the future. For instance: AutoGPT applied to Kubernetes would fall off observability's plate.

An AI TAG sounds like a solid way to oversee the landscape in the future.

arun-gupta commented 11 months ago

AI/ML working group is also being proposed at OpenSSF https://github.com/ossf/tac/issues/175. There may be an opportunity to collaborate.

TheFoxAtWork commented 11 months ago

Yep - we've been tracking it. Coincidentally, this would be an excellent use case for https://github.com/cncf/toc/issues/889

pacoxu commented 11 months ago

Imo, these fall within the scope of TAG-Observability.

https://github.com/cncf/sandbox/issues/38 k8sgpt is applying to become a sandbox project under TAG Observability.

TheFoxAtWork commented 10 months ago

@nikhita @RichiH @cathyhongzhang Please coordinate with TAG Runtime on the WG discussion for this. From my recollection, we discussed this as a potential joint or two WGs, one in runtime and one in Observability but this needs more coordination and discussion. CC @alolita @erinaboyd @rochaporto

kerthcet commented 10 months ago

Also cc @ahg-g @alculquicondor, who are part of the Kubernetes batch WG and are working on several related projects.

halcyondude commented 10 months ago

For example, questions like:

  • What is the health of my K8s cluster?
  • Tell me how I can improve the utilization of my servers or storage?
  • What gaps are there in my cloud native environment?
  • What parts of my infrastructure have not been used in the last three months?
  • Are there any bottlenecks in my application?
  • Etc, etc

Imo, these fall within the scope of TAG-Observability.

In any case, I think there are plenty of opportunities to collaborate.


Regarding scoping and TAG Observability, I concur! Opportunities abound.

Here's an excerpt from our charter's "in-scope" section:

  • Projects that incorporate novel & insightful approaches to utilizing observability data such as:
    • ML, model training, Bayesian networks, and other data science techniques that enable anomaly & intrusion detection.
    • Correlating resource consumption with costing data to reduce the total cost of cloud native infrastructure
    • Using observability data exposed by service meshes, orchestrators, and other metric sources to inform continuous deployment tooling (e.g. Canary Predicates/Judges).
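As a rough illustration of the anomaly-detection item in the excerpt above, here is a minimal sketch that flags outliers in a metric series using a plain z-score. It is not taken from any CNCF project; the function name `find_anomalies`, the threshold, and the latency samples are all hypothetical, and a real system would likely use something more robust than a global mean and standard deviation.

```python
from statistics import mean, stdev


def find_anomalies(samples, threshold=3.0):
    """Return (index, value) pairs whose z-score exceeds the threshold.

    Hypothetical helper for illustration only: it treats the whole series
    as one population and uses the sample standard deviation.
    """
    if len(samples) < 2:
        return []  # stdev needs at least two data points
    mu = mean(samples)
    sigma = stdev(samples)
    if sigma == 0:
        return []  # a perfectly flat series has no outliers
    return [
        (i, x)
        for i, x in enumerate(samples)
        if abs(x - mu) / sigma > threshold
    ]


# Hypothetical per-minute request latencies in milliseconds.
latencies_ms = [102, 98, 101, 99, 100, 103, 97, 100, 450, 101]

# With only ten samples a single spike cannot reach a z-score of 3,
# so this example assumes a lower threshold of 2.5.
print(find_anomalies(latencies_ms, threshold=2.5))  # [(8, 450)]
```

The lower threshold is needed because in a short series the spike itself inflates the standard deviation; approaches based on the median or a rolling window avoid that, at the cost of more code.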

raravena80 commented 9 months ago

All, TAG-Runtime is good with having a joint-scope WG with TAG-Observability.

I believe we can get started by gathering interest in creating a charter with community members.

rajaskakodkar commented 9 months ago

I can help with spinning up the charter for this WG by collaborating with TAG Runtime and TAG Observability and the community on the charter!

cathyhongzhang commented 9 months ago

Thanks, @AlexsJones and everyone for proposing this and all the discussions! As a TOC member and TOC liaison of tag-runtime and tag-observability, I am pleased to see this AI initiative. AI poses new challenges to the cloud, such as higher resource costs for LLM training. There are exciting opportunities to develop solutions to fill the gaps. The scope could be expansive. We can start with an AI WG and if needed could evolve it into a TAG later on. @rajaskakodkar thanks for drafting the charter. How do you plan to get it reviewed?

rootfs commented 9 months ago

This is a significant initiative. I fully support this.

In addition to the content from @AlexsJones, I also have the following proposal (full disclosure: some ideas are from ChatGPT).

I would appreciate it if the TAG could also address data privacy protection, provide guidelines on how AI models should properly use community-generated data, and promote generative AI tooling to improve productivity and community health.

Specifically I propose to target these personas:

I also hope the TAG can provide these guidelines to address the outlined problem and to safeguard intellectual property while promoting generative AI within CNCF:

I suggest the following action items within this TAG:

AlexsJones commented 9 months ago

Thank you for the feedback, comments and thoughtful suggestions shared thus far. @rajaskakodkar I look forward to working with you on the charter at your earliest convenience. @cathyhongzhang absolutely agree with the sentiment that this is a good litmus test for something small that can be expanded if need be.

@rootfs I feel your position and suggestions are valuable and would welcome you to be involved.

If nothing has been started in a Google Doc, I would be happy to bootstrap one.

Please do join me in #wg-artificial-intelligence on the CNCF Slack.

rootfs commented 9 months ago

There have been many discussions on the CNCF Slack. I integrated these discussions and @AlexsJones's proposal into a Google Doc here

jeremyeder commented 9 months ago

Yes, indeed the timing is great. A TAG would help with alignment within the CNCF landscape. The OSI is preparing a definition of "open source AI" (see https://opensource.org/deepdive/ and https://blog.opensource.org/to-trust-ai-it-must-be-open-and-transparent-period/) which the CNCF community should monitor and help shape.

mkorbi commented 9 months ago

I like the proposal but barely see any significant differentiation from LF AI & Data. Many of the relevant topics to be discussed originate in other TAGs. And the "political" side is covered by, well, the more political parts of the organization.

However, I think what we could/would need is a structure to catch vertical demands such as AI or IoT/Edge, maybe heavily regulated fields such as healthcare etc. This would be more of a bridge between the developing community and the end users.

cathyhongzhang commented 9 months ago

I like @mkorbi's suggestion. I think catching the vertical demands of AI and identifying the cloud native gaps to support the new market would be valuable.

raravena80 commented 9 months ago

👍 I think the charter has to have a clear cloud native angle.

mrbobbytables commented 9 months ago

I don't know how active they are, but there are some verticals that DO exist already - IoT/Edge is under TAG Runtime, there was a Telco User Group, and I believe a few others that focus on verticals (cc @onlydole)

rootfs commented 9 months ago

Alex has set up a meeting for a face-to-face discussion. It is at 9 AM PT / 12 PM ET / 17:00 London time.

WG-artificial-intelligence
Thursday, 5 October · 17:00 – 18:00
Time zone: Europe/London
Google Meet joining info
Video call link: https://meet.google.com/uym-xijn-zcy

rootfs commented 9 months ago

I don't think this TAG or WG needs to redo what other foundations and the CNCF community have already done. AI has many use cases to address; this TAG or WG should identify the goals that CNCF is uniquely positioned to pursue.

onlydole commented 9 months ago

I want to raise this with the End User Technical Advisory Board (TAB) to gather concerns, questions, or commentary from our end user community on GenAI and related topics, and on getting involved. This group will be formed by the end of October, and we can raise this in some of the initial meetings and talk with the TOC (though this doesn't have any bearing on this decision, FWIW!)

amye commented 9 months ago

Alex has setup a meeting to f2f discussion, it is 9AM PT/12PM ET/17PM EU(London) time


WG-artificial-intelligence
Thursday, 5 October · 17:00 – 18:00
Time zone: Europe/London
Google Meet joining info

Let's get y'all on the TAG Runtime zoom, it's way way easier to manage meeting recordings + Zoombombing (pray it never happens)

amye commented 9 months ago

https://tockify.com/cncf.public.events/detail/658/1696521600000?search=Discussion%20for%20forming%20AI%20Working%20Group is on the calendars with notes + a zoom room and everything's ready to go

AlexsJones commented 9 months ago

https://tockify.com/cncf.public.events/detail/658/1696521600000?search=Discussion%20for%20forming%20AI%20Working%20Group is on the calendars with notes + a zoom room and everything's ready to go

Thanks for this Amy!

yuzisun commented 8 months ago

I can get behind a TAG to help oversee this space with a caveat.

LLMs and conversational AI are bringing a lot of attention to the space, but I would encourage folks not to focus on it. The field is broader, and there's been a gradual growth in Cloud Native AI/ML, HPC & Data Processing for several years now. They are intrinsically linked with each other, but many focus too far up the stack.

There are several projects and groups that would already fall under this expanded umbrella:

There are also a slew of projects outside the CNCF that fall into this space, flux-framework, YuniKorn etc.

The research user group also has a good idea on gaps and where further investment is needed in the space and I would encourage a strong collaboration with them.

KServe is a project focused on standardizing ML model serving in cloud native environments. It was originally a sub-project under the Kubeflow umbrella. Currently the project is hosted under LFAI, and we are considering moving it to CNCF, as it plays a vital role in cloud native AI infrastructure: serving is a critical step in the ML lifecycle for deploying AI models into production and realizing business value.

johnugeorge commented 8 months ago

I can get behind a TAG to help oversee this space with a caveat.

LLMs and conversational AI are bringing a lot of attention to the space, but I would encourage folks not to focus on it. The field is broader, and there's been a gradual growth in Cloud Native AI/ML, HPC & Data Processing for several years now. They are intrinsically linked with each other, but many focus too far up the stack.

There are several projects and groups that would already fall under this expanded umbrella:

There are also a slew of projects outside the CNCF that fall into this space, flux-framework, YuniKorn etc.

The research user group also has a good idea on gaps and where further investment is needed in the space and I would encourage a strong collaboration with them.

Adding more context: Kubeflow is a CNCF incubating MLOps project and shares a similar vision in the AI space. It provides a full cloud native stack for the whole ML workflow, with various components orchestrating model building, training, tuning and inference. We have integrations with other related CNCF projects like Volcano, Kueue, etc.

We are a mature MLOps platform, active since 2017. I am looking forward to helping shape this initiative and collaborating with the community.

cartermp commented 8 months ago

On one hand, I'd like to see a TAG form up. Having built with LLMs (and been in production) for most of this year, it's exciting but also an area in dire need of better tools and guidance. I'd love to contribute to that.

But is that really the job of the CNCF? Feels out of scope to me as there's no obvious cloud-native tie-in. I sort of see AI as cross-cutting and orthogonal across all industries, just as large and varied as "cloud" is. Maybe an outcome of having a CNCF TAG is that it spins out into a wholly separate organization with its own TAGs and projects.

caniszczyk commented 8 months ago

@cartermp I think it depends on your perspective; a lot of folks that are using AI at scale to run/build LLMs tend to use a lot of CNCF projects to make this happen. There's a group of folks here in a foundation that cuts across major clouds and vendors that are willing to collaborate on something. There's no other place really in the industry that has this, and we had success with other initiatives like Serverless https://www.cncf.io/blog/2018/02/14/cncf-takes-first-step-towards-serverless-computing/ or even the recent Wasm WG https://github.com/cncf/tag-runtime/issues/58

In the LF, there are other foundations like LFAI, Generative AI Commons, PyTorch etc that all focus on specific things. There can be areas there to collaborate but they tend to be smaller.

If folks really want to get together and do something that the TOC considers in scope, we should give folks some space to collaborate.

cathyhongzhang commented 8 months ago

I agree with @caniszczyk. I see a lot of enthusiasm from the community for a space to collaborate. LLMs drive a quantum leap in complexity, which has resulted in new requirements for cloud resource orchestration and cloud infrastructure. @cartermp Since you have worked on building LLMs, we would really like to hear your input on the pain points, including the requirements for better tools, better GPU/CPU/memory resource management, etc. We will have another kick-off meeting for the WG charter. Would you like to join #wg-artificial-intelligence on Slack? A lot of discussion as well as meeting info happens there.

cartermp commented 8 months ago

That's fair @caniszczyk and @cathyhongzhang -- just to clarify, while I've not built LLMs themselves (I don't think fine-tuning counts!), I have certainly been building with LLMs due to their power and incredible accessibility compared to previous generations of ML technologies. There's a whole world of "what do you do when you're in production now" with this stuff that I'd love to contribute towards. I think one of the more natural tie-ins for end-users is Observability (see also: https://github.com/open-telemetry/semantic-conventions/issues/327), so I suppose I do agree there are more opportunities to explore here than I initially thought.

If folks really want to get together and do something that the TOC considers in scope, we should give folks some space to collaborate.

This makes sense to me. Happy to collaborate!

raravena80 commented 8 months ago

@cartermp thanks for collaborating and helping out!

I think one of the more natural tie-ins for end-users is Observability

Yep. I think we likely want to look at these aspects:

  1. How AI (and generative AI) can help/implicate/change cloud native including users, projects, security etc.
  2. How to use cloud native tools/projects to enable AI (and generative AI). This includes K8s, MLOps, LLMOps, GPU enablers, etc.

TheFoxAtWork commented 8 months ago

The working group is still meeting to sort through the scope and the charter. Once we get an updated scope, this issue will be updated to reflect that and once the group is established we'll close this issue. CC @raravena80

paravatha commented 8 months ago

This is a super interesting initiative. @johnugeorge and @yuzisun have already mentioned Kubeflow and KServe. I'd like to mention https://argoproj.github.io/. In particular, ArgoCD and Argo Workflows are both widely used in k8s-based MLOps platforms.

I worked on an ML platform team that enabled Kubeflow+KServe+ArgoCD to build and deploy 100s of deep learning models across different environments.

Looking forward to learning more about this WG and initiative!

nirga commented 8 months ago

I also wanted to mention OpenLLMetry here, a set of extensions built on top of OpenTelemetry to provide open-protocol tracing and monitoring for LLM applications. I'm going to present it at the next TAG Observability meeting.

rajaskakodkar commented 7 months ago

We can close this issue in favour of https://github.com/cncf/toc/issues/1200

nikhita commented 7 months ago

Closing as per @rajaskakodkar's comment. Let's direct any further discussions to #1200