kyma-project / kyma

Kyma is an opinionated set of Kubernetes-based modular building blocks, including all necessary capabilities to develop and run enterprise-grade cloud-native applications.
https://kyma-project.io
Apache License 2.0
1.51k stars, 407 forks

Enable consumption and configuration of specific hyperscaler resources [EPIC] #18195

Open varbanv opened 10 months ago

varbanv commented 10 months ago

Description

Provide a way for end users to consume and be charged for a pre-defined set of hyperscaler resources:

Making standard machine types configurable in worker pools is addressed in a separate story: https://github.com/kyma-project/kyma/issues/18709. Any additional node-specific settings are expected to be added to that concept as further options.

Context

Problem

Currently, Kyma is a layer on top of Kubernetes and as such provides a very limited set of infrastructure configuration options at provisioning time. However, customers looking to adopt Kyma who already use hyperscaler offerings take advantage of specialized resources as part of their workloads (for example, faster storage, GPU nodes, or network-optimized nodes). Because Kyma does not expose these resources, those users cannot onboard to Kyma without re-engineering their workloads.

Benefits

For customers:

For us:

Potential problems

Gathering of Resources to support

Billing requirements

Acceptance criteria

Tasks

Disper commented 8 months ago

marco-porru commented 8 months ago

The cKMS team is evaluating the usage of Kyma. The team needs "confidential computing capabilities". This kind of machine is certainly available on Azure and GCP.

marco-porru commented 7 months ago

SAP for Me would like to use m6g and m6in machine types

valentinvieriu commented 6 months ago

+1 for GPUs

marco-porru commented 6 months ago

+1 for GPUs

Thanks, Valentin, for reporting it. I think it's worth mentioning the context, so let me do it on your behalf for simplicity 😄: it's to make it possible for Core AI to run on Kyma (subject to future discussions and agreements).

varbanv commented 4 months ago

Had a preliminary workshop with @tobiscr and @PK85 and added a first set of tasks to work on.

marco-porru commented 3 months ago

+1 team for GPU (ICN Munich)

marco-porru commented 3 months ago

Enable more private connectivity (e.g. via firewall), requested by no fewer than 3 teams (e.g. S/4HANA ABAP Machines).

marco-porru commented 3 months ago

Enable the GCP Assured Workloads module (relevant for KSA), requested by the BTP email service.

abbi-gaurav commented 3 months ago

A customer is looking for very high IOPS storage. For example, enabling Azure ultra disks could help them: https://learn.microsoft.com/en-us/azure/virtual-machines/disks-enable-ultra-ssd?tabs=azure-portal

abbi-gaurav commented 2 months ago

At present, customers are able to use resources for which they are not charged such as

We should make customers aware that they might have to pay for this in the future, so that it does not come as a surprise to them.

@NHingerl, could you please help? IMHO, putting this info out does not need to wait until this epic is done.

lanthoor commented 1 month ago

SAP IPR would like to use g5 and r7i instance types along with other hyperscaler resources like ALB/NLB.

pthd commented 1 month ago

+1 for GPU support. AI scenarios require GPU-powered instances. More precisely, we want to leverage Transformer models, which run much faster on GPUs.

MarcusNotheis commented 4 weeks ago

We would be interested in OpenSearch consumption

marco-porru commented 2 weeks ago

GPUs for the Product Services team (already live), in particular GCP A100, H100, and H200 machines.

marco-porru commented 2 weeks ago

GPUs also requested by NGS (already live).