knative / client

Knative developer experience, docs, reference Knative CLI implementation
Apache License 2.0

Option to specify Node Selector #1742

Closed NaxAlpha closed 1 year ago

NaxAlpha commented 2 years ago

Feature request

I would like the CLI to provide an option to specify a node selector for a service.

Use case

I have an ML service that runs on A100 GPUs. The service has two active revisions: a beta revision and a stable revision. Each new release is first deployed to the beta revision, where it is tested. It is then promoted to the prod endpoint, which runs on multiple A100 instances.

Right now, I deploy the beta on one A100 instance, which is overkill. I have another node pool with Tesla T4 GPUs where I would like the beta version to run, so a CLI option to specify nodeSelector would be extremely helpful.

UI Example

Here is an example of how it is deployed right now:

# beta deployment script
kn service update my-app --image=my-app:version-10 --env ENV=stg --scale-min=1 --scale-max=2

# prod deployment script
kn service update my-app --image=my-app:version-10 --env ENV=prd --scale-min=2 --scale-max=1000

Here is what I think would be extremely helpful for my use case:

# beta deployment
kn service update my-app --image=my-app:version-10 --env ENV=stg --scale-min=1 --scale-max=2 --node-selector-label cloud.google.com/gke-accelerator=nvidia-tesla-t4

# prod deployment
kn service update my-app --image=my-app:version-10 --env ENV=prd --scale-min=2 --scale-max=1000 --node-selector-label cloud.google.com/gke-accelerator=nvidia-tesla-a100

rhuss commented 2 years ago

If I understand the spec in https://github.com/knative/specs/blob/main/specs/serving/knative-api-specification-1.0.md#revision-2 correctly, nodeSelector cannot be set on the PodSpec of a Knative Service / Revision, so I'm afraid the client can't help much here.
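
For illustration only, here is a minimal sketch of how the requested label could be set outside of kn. It rests on assumptions that are not confirmed anywhere in this thread: that the cluster's Knative Serving installation has the optional `kubernetes.podspec-nodeselector` extension enabled in its `config-features` ConfigMap (an opt-in extension rather than part of the core spec linked above, so check the Serving feature-flag docs for your version), and that `kubectl` has access to the cluster. The helper name `node_selector_patch` is hypothetical. It only builds a JSON merge patch that places the label under `spec.template.spec.nodeSelector`.

```shell
#!/bin/sh
# Hypothetical helper (not part of kn): build a JSON merge patch that sets
# spec.template.spec.nodeSelector on a Knative Service.
node_selector_patch() {
  # $1 = node label key, $2 = node label value
  printf '{"spec":{"template":{"spec":{"nodeSelector":{"%s":"%s"}}}}}\n' "$1" "$2"
}

# Label taken from the beta example in this issue.
patch=$(node_selector_patch cloud.google.com/gke-accelerator nvidia-tesla-t4)
echo "$patch"

# Applying the patch needs cluster access and the extension flag mentioned
# above (an assumption), so the command is left commented out:
# kubectl patch ksvc my-app --type merge -p "$patch"
```

Changing the revision template this way would roll out a new revision, so traffic splitting between the beta and stable revisions would still have to be managed separately.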

github-actions[bot] commented 1 year ago

This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.