kubernetes-sigs / lws

LeaderWorkerSet: An API for deploying a group of pods as a unit of replication
Apache License 2.0
121 stars 23 forks source link

Add group size as an env var by default #203

Closed ahg-g closed 1 week ago

ahg-g commented 2 weeks ago

What would you like to be added:

Add the group size as an env var

Why is this needed:

In most cases for multi-host inference, the size is needed, like in vllm.

Suggest to use LWS_GROUP_SIZE as the env var name.

Completion requirements:

This enhancement requires the following artifacts:

The artifacts should be linked in subsequent comments.

liurupeng commented 1 week ago

+1, this will be very helpful!

googs1025 commented 1 week ago

Can I try it? I will spend some time on it :)

googs1025 commented 1 week ago

/assign

googs1025 commented 1 week ago

friendly ping :) @ahg-g @liurupeng This is an example I got. I want to know one thing. What we need to do is inject the spec.leaderWorkerTemplate.size field into the container, right?

apiVersion: leaderworkerset.x-k8s.io/v1
kind: LeaderWorkerSet
metadata:
  labels:
    app.kubernetes.io/name: leaderworkerset
    app.kubernetes.io/instance: leaderworkerset-multi-template
    app.kubernetes.io/part-of: lws
    app.kubernetes.io/managed-by: kustomize
    app.kubernetes.io/created-by: lws
  name: leaderworkerset-multi-template
spec:
  replicas: 3
  leaderWorkerTemplate:
    leaderTemplate:
      spec:
        containers:
        - name: nginx2
          image: nginx:1.14.2
          resources:
            limits:
              cpu: "100m"
            requests:
              cpu: "50m"
          ports:
          - containerPort: 8080
    size: 4
    workerTemplate:
      spec:
        containers:
        - name: nginx
          image: nginx:1.14.2
          resources:
            limits:
              cpu: "100m"
            requests:
              cpu: "50m"
          ports:
          - containerPort: 8080
googs1025 commented 1 week ago

If so, I plan to use this annotationKey directly to implement this feature

// Size will be added to leader pods as an annotation which corresponds to
// LeaderWorkerSet.Spec.LeaderWorkerTemplate.Size.
SizeAnnotationKey string = "leaderworkerset.sigs.k8s.io/size"
liurupeng commented 1 week ago

thanks for picking it up @googs1025

googs1025 commented 1 week ago

It's done now. If we have any questions, feel free to reopen. :)

googs1025 commented 1 week ago

/close

k8s-ci-robot commented 1 week ago

@googs1025: Closing this issue.

In response to [this](https://github.com/kubernetes-sigs/lws/issues/203#issuecomment-2332905577): >/close Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.