Open wawa0210 opened 3 months ago
TAG-Runtime
"Project repo URL in scope of application" lists just the main repo. Are the other repos out of scope for donation?
Is the k8s-dra-driver fork for convenience, or is it really going to be a fork?
All public repos are in scope for donation.
k8s-dra-driver is forked for convenience; we plan to build our own DRA driver.
We've been exploring the combination of HAMi and DRA, and it is currently on our roadmap as well.
Application contact emails
limengxuan@4paradigm.com, xiaozhang0210@hotmail.com
Project Summary
Heterogeneous AI Computing Virtualization Middleware (HAMi) is an "all-in-one" tool designed to manage heterogeneous AI computing devices in a k8s cluster.
Project Description
Heterogeneous AI Computing Virtualization Middleware (HAMi) is an "all-in-one" tool designed to manage Heterogeneous AI Computing Devices in a k8s cluster. It includes everything you would expect, such as:
nvidia.com/use-gputype or nvidia.com/nouse-gputype.
nvidia.com/use-gpuuuid or nvidia.com/nouse-gpuuuid.
nvidia.com/gpu if you prefer.
The core features of HAMi are as follows
The HAMi architecture is as follows
Application Scenarios
Org repo URL (provide if all repos under the org are in scope of the application)
https://github.com/Project-HAMi
Project repo URL in scope of application
Core repo: https://github.com/Project-HAMi/HAMi
And the other public repos under https://github.com/Project-HAMi/
Additional repos in scope of the application
No response
Website URL
http://project-hami.io/
Roadmap
https://github.com/Project-HAMi/HAMi?tab=readme-ov-file#roadmap
Roadmap context
Contributing Guide
https://github.com/Project-HAMi/HAMi/blob/master/CONTRIBUTING.md
Here are our community meeting minutes
https://docs.google.com/document/d/1YC6hco03_oXbF9IOUPJ29VWEddmITIKIfSmBX8JtGBw/edit?usp=sharing
Code of Conduct (CoC)
https://github.com/Project-HAMi/HAMi/blob/master/CODE_OF_CONDUCT.md
Adopters
We ran a survey and found that dozens of adopters are already using HAMi. We will maintain the adopter list in the HAMi documentation later. Online survey results
Contributing or Sponsoring Org
4paradigm, DaoCloud, HuaweiCloud, Rise Union
Maintainers file
https://github.com/Project-HAMi/HAMi/blob/master/MAINTAINERS.md
IP Policy
Trademark and accounts
Why CNCF?
The CNCF is the premier organization for cloud-native technologies and is backed by many leading companies in the industry. It also provides a platform for collaboration and community-building, which can lead to increased visibility, adoption, and contributions to HAMi.
At the same time, HAMi can be combined with other outstanding CNCF projects (such as Volcano, KubeRay, and Kueue) to provide a one-stop service for AI infrastructure.
Benefit to the Landscape
As AI becomes more popular, different smart devices are springing up. Nvidia is the most prominent vendor, but many other device makers are also actively embracing K8s and CNCF. It is therefore particularly important that these numerous GPUs, NPUs, and other devices offer a consistent user experience on one platform, and this is exactly what HAMi focuses on. Using HAMi greatly simplifies the management and operation of GPUs and NPUs on K8s, and the application layer does not need to be aware of differences in the underlying hardware.
Cloud Native 'Fit'
HAMi is built using cloud native technology. It currently uses scheduler-plugin, webhook, device-plugin, and other technologies to manage and schedule heterogeneous AI computing devices. In the future, we will consider using DRA to optimize the architecture.
Cloud Native 'Integration'
HAMi reuses part of the source code of the nvidia device-plugin project to support basic Nvidia GPU features. On top of this, we support the following extensions for Nvidia GPUs.
nvidia.com/use-gputype or nvidia.com/nouse-gputype
nvidia.com/use-gpuuuid or nvidia.com/nouse-gpuuuid
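As a rough illustration of how these annotations might be attached to a workload, here is a minimal Python sketch that builds a Pod manifest as a plain dict and prints it as JSON. Only the annotation keys and the nvidia.com/gpu resource name come from this application. The helper name build_pod_manifest, the image, the example GPU model names, and the assumption that each annotation takes a comma-separated string are hypothetical, so please check the HAMi documentation for the exact value format.

```python
import json


def build_pod_manifest(name, image, gpu_count=1, use_gputype=None,
                       nouse_gputype=None, use_gpuuuid=None,
                       nouse_gpuuuid=None):
    """Build a Pod manifest dict that requests GPUs and sets selection annotations.

    Hypothetical helper for illustration only; it is not part of HAMi.
    """
    annotations = {}
    # Assumption: each annotation value is a comma-separated list of entries.
    for key, values in (
        ("nvidia.com/use-gputype", use_gputype),
        ("nvidia.com/nouse-gputype", nouse_gputype),
        ("nvidia.com/use-gpuuuid", use_gpuuuid),
        ("nvidia.com/nouse-gpuuuid", nouse_gpuuuid),
    ):
        if values:
            annotations[key] = ",".join(values)

    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "annotations": annotations},
        "spec": {
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # The GPU request uses the nvidia.com/gpu resource name.
                    "resources": {"limits": {"nvidia.com/gpu": gpu_count}},
                }
            ]
        },
    }


if __name__ == "__main__":
    # Image and GPU model names below are placeholders, not recommendations.
    pod = build_pod_manifest(
        "gpu-demo",
        "example.com/cuda-app:latest",
        gpu_count=1,
        use_gputype=["A100", "V100"],
    )
    print(json.dumps(pod, indent=2))
```

The printed JSON could be saved to a file and applied with kubectl, which accepts JSON manifests as well as YAML.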
Cloud Native Overlap
We do not think there is direct overlap with other CNCF projects at this time. However, we do touch on some of the areas that other projects are investigating in the space of device-plugin and scheduler enhancement.
Volcano also provides the ability to share GPUs. In version v1.8, the volcano-vgpu features were contributed to the volcano repo by a HAMi maintainer. However, after discussions with the volcano maintainers, and in order to support the independent development of the HAMi community, it was decided to release it in version v1.9. Later, this functionality was transferred to the HAMi project and is now maintained by the HAMi community (repo: https://github.com/Project-HAMi/volcano-vgpu-device-plugin).
Similar projects
Some comparisons with similar projects to HAMi
highlight
Comparison of GPU sharing solutions
Landscape
yes
HAMi is in the landscape and also in the CNAI group
https://landscape.cncf.io/?group=cnai
Business Product or Service to Project separation
N/A
Project presentations
No response
Project champions
No response
Additional information
No response