kubernetes-sigs / kwok

Kubernetes WithOut Kubelet - Simulates thousands of Nodes and Clusters.
https://kwok.sigs.k8s.io
Apache License 2.0

Add operator to manage any resource created and deleted #1126

Open wzshiming opened 1 month ago

wzshiming commented 1 month ago

What would you like to be added?


apiVersion: resource.operator.kwok.x-k8s.io/v1alpha1
kind: Resource
metadata:
  name: node-2c4g
spec:
  templateName: node
  replicas: 10
  parameters:
    allocatable:
      cpu: 2
      memory: 4Gi
---
apiVersion: resource.operator.kwok.x-k8s.io/v1alpha1
kind: Resource
metadata:
  name: node-4c8g
spec:
  templateName: node
  replicas: 10
  parameters:
    allocatable:
      cpu: 4
      memory: 8Gi
---
apiVersion: resource.operator.kwok.x-k8s.io/v1alpha1
kind: ResourceTemplate
metadata:
  name: node
spec:
  parameters:
    podCIDR: "10.0.0.1/24"
    allocatable:
      cpu: 32
      memory: 256Gi
      pods: 110
    capacity: {}
    nodeInfo:
      architecture: amd64
      operatingSystem: linux
  template: |-
    kind: Node
    apiVersion: v1
    metadata:
      name: {{ Name }}
      annotations:
        kwok.x-k8s.io/node: fake
        node.alpha.kubernetes.io/ttl: "0"
        metrics.k8s.io/resource-metrics-path: "/metrics/nodes/{{ Name }}/metrics/resource"
      labels:
        beta.kubernetes.io/arch: {{ .nodeInfo.architecture }}
        beta.kubernetes.io/os: {{ .nodeInfo.operatingSystem }}
        kubernetes.io/arch: {{ .nodeInfo.architecture }}
        kubernetes.io/hostname: {{ Name }}
        kubernetes.io/os: {{ .nodeInfo.operatingSystem }}
        kubernetes.io/role: agent
        node-role.kubernetes.io/agent: ""
        type: kwok
    spec:
      podCIDR: {{ AddCIDR .podCIDR Index }}
    status:
      allocatable:
      {{ range $key, $value := .allocatable }}
        {{ $key }}: {{ $value }}
      {{ end }}
      {{ $capacity := .capacity }}
      capacity:
      {{ range $key, $value := .allocatable }}
        {{ $key }}: {{ or ( index $capacity $key ) $value }}
      {{ end }}
      nodeInfo:
      {{ range $key, $value := .nodeInfo }}
        {{ $key }}: {{ $value }}
      {{ end }}
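
For illustration, here is a minimal Go sketch of how an operator might render such a template once per replica with `text/template`. It assumes `Name` and `Index` are zero-argument template functions and that `AddCIDR` shifts the prefix forward by `Index` subnets of the same size; the proposal does not define these semantics. The helper names `render` and `addCIDR` and the `<name>-<index>` naming scheme are hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
	"net/netip"
	"text/template"
)

// addCIDR is a hypothetical AddCIDR helper: it moves an IPv4 prefix forward
// by index subnets of the same size (an assumed meaning, not from the proposal).
func addCIDR(cidr string, index int) (string, error) {
	p, err := netip.ParsePrefix(cidr)
	if err != nil {
		return "", err
	}
	if !p.Addr().Is4() {
		return "", fmt.Errorf("only IPv4 is handled in this sketch: %s", cidr)
	}
	b := p.Addr().As4()
	v := uint32(b[0])<<24 | uint32(b[1])<<16 | uint32(b[2])<<8 | uint32(b[3])
	v += uint32(index) << (32 - p.Bits())
	next := netip.AddrFrom4([4]byte{byte(v >> 24), byte(v >> 16), byte(v >> 8), byte(v)})
	return netip.PrefixFrom(next, p.Bits()).String(), nil
}

// render executes the template source for one replica. Name and Index are
// exposed as functions because the proposed template calls them without a dot.
func render(src, name string, index int, params map[string]any) (string, error) {
	funcs := template.FuncMap{
		"Name":    func() string { return name },
		"Index":   func() int { return index },
		"AddCIDR": addCIDR,
	}
	t, err := template.New("resource").Funcs(funcs).Option("missingkey=error").Parse(src)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := t.Execute(&buf, params); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	// A shortened version of the template, enough to show the mechanics.
	const src = "metadata:\n  name: {{ Name }}\nspec:\n  podCIDR: {{ AddCIDR .podCIDR Index }}\n"
	params := map[string]any{"podCIDR": "10.0.0.1/24"}
	for i := 0; i < 3; i++ {
		out, err := render(src, fmt.Sprintf("node-2c4g-%d", i), i, params)
		if err != nil {
			panic(err)
		}
		fmt.Print(out, "---\n")
	}
}
```

The rendered text would still need to be decoded as YAML and applied to the API server, which this sketch leaves out.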

Why is this needed?

https://kubernetes.slack.com/archives/C04RG2YSK16/p1716989513415299?thread_ts=1716796734.402529&cid=C04RG2YSK16

dormullor commented 1 month ago

@wzshiming What do you think about the API below?

apiVersion: kwok.sigs.k8s.io/v1beta1
kind: NodePool
metadata:
  name: nodepool-sample
spec:
  nodeCount: 3
  nodeTemplate:
    apiVersion: v1
    metadata:
      annotations:
        node.alpha.kubernetes.io/ttl: "0"
      labels:
        kubernetes.io/role: agent
        nvidia.com/gpu.deploy.device-plugin: "true"
        nvidia.com/gpu.deploy.dcgm-exporter: "true"
        type: kwok
    spec: {}
    status:
      allocatable:
        cpu: 32
        memory: 256Gi
        pods: 110
      capacity:
        cpu: 32
        memory: 256Gi
        pods: 110
      nodeInfo:
        architecture: amd64
        bootID: ""
        containerRuntimeVersion: ""
        kernelVersion: ""
        kubeProxyVersion: fake
        kubeletVersion: fake
        machineID: ""
        operatingSystem: linux
        osImage: ""
        systemUUID: ""
      phase: Running

The kwok.x-k8s.io/node: fake annotation and the node taint are automatically added to all nodes.

The spec can be a simple struct that reuses the Kubernetes corev1.Node object:

// NodePoolSpec defines the desired state of NodePool
type NodePoolSpec struct {
    NodeCount    int32       `json:"nodeCount"`
    NodeTemplate corev1.Node `json:"nodeTemplate"`
}
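
As a rough sketch of what a controller could do with this spec, the Go code below expands a pool into `NodeCount` nodes. To stay dependency-free it uses a simplified stand-in `Node` type instead of `corev1.Node`, so the field names differ from the struct above. The `expand` function and the `<pool>-<index>` naming are assumptions; only the automatically added `kwok.x-k8s.io/node: fake` annotation comes from the description above.

```go
package main

import "fmt"

// Node is a simplified stand-in for corev1.Node so this sketch needs no
// external modules; a real operator would use k8s.io/api/core/v1.
type Node struct {
	Name        string
	Labels      map[string]string
	Annotations map[string]string
}

// NodePoolSpec mirrors the proposed spec, with the stand-in Node type.
type NodePoolSpec struct {
	NodeCount    int32
	NodeTemplate Node
}

// copyMap returns an independent copy so nodes do not share label maps.
func copyMap(m map[string]string) map[string]string {
	out := make(map[string]string, len(m)+1)
	for k, v := range m {
		out[k] = v
	}
	return out
}

// expand returns NodeCount copies of the template. The "<pool>-<index>"
// naming scheme is hypothetical.
func expand(pool string, spec NodePoolSpec) []Node {
	var nodes []Node
	for i := int32(0); i < spec.NodeCount; i++ {
		n := Node{
			Name:        fmt.Sprintf("%s-%d", pool, i),
			Labels:      copyMap(spec.NodeTemplate.Labels),
			Annotations: copyMap(spec.NodeTemplate.Annotations),
		}
		// Added to every node automatically, as described above.
		n.Annotations["kwok.x-k8s.io/node"] = "fake"
		nodes = append(nodes, n)
	}
	return nodes
}

func main() {
	spec := NodePoolSpec{
		NodeCount:    3,
		NodeTemplate: Node{Labels: map[string]string{"type": "kwok"}},
	}
	for _, n := range expand("nodepool-sample", spec) {
		fmt.Println(n.Name, n.Labels, n.Annotations)
	}
}
```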

wzshiming commented 1 month ago

This is fine as a first version of the API, but if we later need a DeploymentPool, or a pool for any custom resource, would we have to implement each of those separately?

dormullor commented 1 month ago

Can you explain what a DeploymentPool is? I thought kwok's purpose is to "fake" nodes.

wzshiming commented 1 month ago

https://kwok.sigs.k8s.io/#what-is-kwok

kwok is the cornerstone of this project, responsible for simulating the lifecycle of fake nodes, pods, and other Kubernetes API resources.

Other Kubernetes API resources are also part of the goal.

kwok now supports simulating other resources, such as KubeVirt's VMI, so there may be a need for VMI pools in the future.

dormullor commented 1 month ago

I see. I think a dedicated API for each resource would help if different resources need different logic (e.g. NodePool, KubevirtPool, etc.). Also, IMHO, working with templates that start as strings and are then converted into structs is very error-prone. WDYT?

wzshiming commented 1 month ago

Also, IMHO, working with templates that start as strings and are then converted into structs is very error-prone.

I'm with you on that one.

In fact, the current node and pod simulations already use templates, and chaos simulation will use them too.

As a result, a validation tool was recently added in CI for this batch of simulation stage templates to ensure that the templates are properly formed.
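
This is not the CI tool mentioned above, but a minimal Go sketch of the kind of check such a tool could run: parse the template and execute it once against the default parameters, failing on syntax errors, unknown functions, or missing parameters. The `validate` function and the stubbed `Name`, `Index`, and `AddCIDR` helpers are hypothetical.

```go
package main

import (
	"fmt"
	"io"
	"text/template"
)

// validate parses src and executes it once against the default parameters.
// It only proves the template is well-formed and that every referenced
// parameter exists; it does not check that the output is a valid resource.
func validate(src string, defaults map[string]any) error {
	// Stub helpers with fixed values, enough to exercise the template.
	funcs := template.FuncMap{
		"Name":    func() string { return "validate-0" },
		"Index":   func() int { return 0 },
		"AddCIDR": func(cidr string, index int) string { return cidr },
	}
	t, err := template.New("check").Funcs(funcs).Option("missingkey=error").Parse(src)
	if err != nil {
		return fmt.Errorf("parse: %w", err)
	}
	if err := t.Execute(io.Discard, defaults); err != nil {
		return fmt.Errorf("execute: %w", err)
	}
	return nil
}

func main() {
	defaults := map[string]any{
		"nodeInfo": map[string]any{"architecture": "amd64"},
	}
	fmt.Println(validate("arch: {{ .nodeInfo.architecture }}", defaults))
	fmt.Println(validate("arch: {{ .nodeInfo.arch }}", defaults))
	fmt.Println(validate("{{ range }", defaults))
}
```

A check like this catches a malformed default template before release, but it cannot stop a user from applying a broken one at runtime.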

dormullor commented 1 month ago

How can you prevent a user from applying a malformed template for a node?

wzshiming commented 1 month ago

There is no way. We try not to let the user touch the template; if the user modifies it, they are responsible for the result.

As you can see from my definition above, the user only needs to modify the Resource. The ResourceTemplate provides the defaults, and that is enough!
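
One way to read "the ResourceTemplate provides the defaults" is a deep merge in which the Resource's parameters override the template's. The Go sketch below assumes that merge semantics, which the proposal does not spell out; `mergeParams` is a hypothetical helper.

```go
package main

import "fmt"

// mergeParams returns a new map holding defaults overlaid with overrides.
// Nested maps are merged key by key; any other value is replaced outright.
// Neither input map is modified.
func mergeParams(defaults, overrides map[string]any) map[string]any {
	out := make(map[string]any, len(defaults))
	for k, v := range defaults {
		if m, ok := v.(map[string]any); ok {
			out[k] = mergeParams(m, nil) // copy nested defaults
		} else {
			out[k] = v
		}
	}
	for k, v := range overrides {
		ov, isMap := v.(map[string]any)
		base, baseIsMap := out[k].(map[string]any)
		if isMap && baseIsMap {
			out[k] = mergeParams(base, ov)
		} else {
			out[k] = v
		}
	}
	return out
}

func main() {
	// Defaults from the ResourceTemplate named "node".
	defaults := map[string]any{
		"podCIDR":     "10.0.0.1/24",
		"allocatable": map[string]any{"cpu": 32, "memory": "256Gi", "pods": 110},
	}
	// Overrides from the Resource named "node-2c4g".
	overrides := map[string]any{
		"allocatable": map[string]any{"cpu": 2, "memory": "4Gi"},
	}
	fmt.Println(mergeParams(defaults, overrides))
}
```

Under this reading, node-2c4g would get cpu 2 and memory 4Gi from the Resource while keeping pods 110 and the podCIDR from the template.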

dormullor commented 1 month ago

Got it, thanks for the info. The way I see it, we can create an API for each type of resource KWOK supports, and also create a generic one that can operate on a CR like you did:

  template: |-
    kind: ANY TYPE OF RESOURCE

How does that sound?

wzshiming commented 1 month ago

we can create an API for each type of resource KWOK supports

kwok now supports all resources, but for this we should only need to support Node and Pod.

create a generic one that can operate on a CR like you did:

I don't quite understand this. Do you mean creating a generic Template CR?

dormullor commented 1 month ago

If it's needed, we can create an API that lets the operator take any kind of resource as a string and apply it to Kubernetes, the same way you did with the template.

wzshiming commented 1 month ago

Got it