volcano-sh / volcano

A Cloud Native Batch System (Project under CNCF)
https://volcano.sh
Apache License 2.0
4.18k stars 962 forks source link

Cannot assign deployment to specific queue by upper annotations. #2731

Open jiangkaihua opened 1 year ago

jiangkaihua commented 1 year ago

What happened:

When I apply a deployment like:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  annotations:
    scheduling.volcano.sh/queue-name: q1
  labels:
    app: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80

I supposed that the nginx deployment would be scheduled to queue q1, but actually it was scheduled to queue default.

What you expected to happen:

The annotations should be inherited by replicaset created by nginx deployment, and when the pgcontroller created podgroup for the pod from replicaset, it should read the annotation and assign the podgroup to queue q1.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

It's probably caused by different prefix volcano.sh/ and scheduling.volcano.sh/:

In inheritUpperAnnotations, it would readin all annotations with prefix volcano.sh/:

https://github.com/volcano-sh/volcano/blob/12a44e3faea7c7dc3332a96d430adae67a7c96e9/pkg/controllers/podgroup/pg_controller_handler.go#L160-L161

So annotations with prefix scheduling.volcano.sh/ would not be readin. But in queue selection, it would select queue by scheduling.volcano.sh/:

https://github.com/volcano-sh/volcano/blob/12a44e3faea7c7dc3332a96d430adae67a7c96e9/pkg/controllers/podgroup/pg_controller_handler.go#L197-L201

Thus causing annotation lost effect, cannot configure queue by upper annotation, only annotations on pod can take effect.

Environment:

halegreen commented 1 year ago

/assign

halegreen commented 1 year ago

@jiangkaihua hi, it seems that you declare the queue name at the wrong position, you should declare queue name at the spec.template.metadata.annotation

here's an example:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: deploy-with-volcano
  labels:
    app: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
      annotations:
        #create test queue and use this annotation
        #to make deployment be scheduled into test queue
        scheduling.volcano.sh/queue-name: q1
    spec:
      # set spec.schedulerName to 'volcano' instead of
      # 'default-scheduler' for deployment.
      schedulerName: volcano
      containers:
        - name: nginx
          image: nginx:latest
          ports:
            - containerPort: 80
          resources:
            requests:
              cpu: 300m

and you can see the pod created by the deployment is at the queue you specified.

jiangkaihua commented 1 year ago

@jiangkaihua hi, it seems that you declare the queue name at the wrong position, you should declare queue name at the spec.template.metadata.annotation

here's an example:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: deploy-with-volcano
  labels:
    app: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
      annotations:
        #create test queue and use this annotation
        #to make deployment be scheduled into test queue
        scheduling.volcano.sh/queue-name: q1
    spec:
      # set spec.schedulerName to 'volcano' instead of
      # 'default-scheduler' for deployment.
      schedulerName: volcano
      containers:
        - name: nginx
          image: nginx:latest
          ports:
            - containerPort: 80
          resources:
            requests:
              cpu: 300m

and you can see the pod created by the deployment is at the queue you specified.

Yes, assign pod to specified queue by spec.template.metadata.annotation can work. But in my opinion it would be better to integrate prefix volcano.sh/ and scheduling.volcano.sh/, so that we could assign queue by annotations in workload level. And it would also avoid potential bugs when different pods from same workload was assigned to different queues, the minResources of podgroup for the workload would be calculated twice in different queue in enqueue action in this scenario.

stale[bot] commented 1 year ago

Hello 👋 Looks like there was no activity on this issue for last 90 days. Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗 If there will be no activity for 60 days, this issue will be closed (we can always reopen an issue if we need!).