Open aledegano opened 2 months ago
Is there some indication from k8s that a resource is being provisioned to satisfy the (currently unschedulable) request?
Shortly after the pod is scheduled I see this event:
apiVersion: v1
count: 1
eventTime: null
firstTimestamp: "2024-04-19T08:05:16Z"
involvedObject:
  apiVersion: v1
  kind: Pod
  name: foo-40bar--an-2daws-2dproject-57f05e85-0
  namespace: renku
  resourceVersion: "21922912"
  uid: 467b94d8-e56a-432c-8ca1-108843aab5ec
kind: Event
lastTimestamp: "2024-04-19T08:05:16Z"
message: 'Pod should schedule on: nodeclaim/core-services-lrkls'
metadata:
  creationTimestamp: "2024-04-19T08:05:16Z"
  name: foo-40bar--an-2daws-2dproject-57f05e85-0.17c79fd3fcc0e889
  namespace: renku
  resourceVersion: "21922946"
  uid: 072773f3-e2b0-4f73-bd9f-4baca5511b66
reason: Nominated
reportingComponent: karpenter
reportingInstance: ""
source:
  component: karpenter
type: Normal
There is certainly more information available from Karpenter, but that might be a bit too specific/platform-dependent...
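As a rough sketch of how such an event could be recognized, assuming it has already been loaded into a plain dict with the same fields as the YAML above: the function name `is_karpenter_nomination` is made up for illustration, and matching on `reason` and `reportingComponent` is Karpenter-specific, so it would not carry over to other autoscalers.

```python
from typing import Any, Mapping


def is_karpenter_nomination(event: Mapping[str, Any]) -> bool:
    """Return True if the event looks like Karpenter nominating a pod for a node claim.

    Hypothetical helper: it only checks fields that appear in the event above.
    """
    involved = event.get("involvedObject") or {}
    return (
        event.get("reason") == "Nominated"
        and event.get("reportingComponent") == "karpenter"
        and involved.get("kind") == "Pod"
    )


# Trimmed copy of the event shown above.
event = {
    "reason": "Nominated",
    "reportingComponent": "karpenter",
    "message": "Pod should schedule on: nodeclaim/core-services-lrkls",
    "involvedObject": {
        "kind": "Pod",
        "name": "foo-40bar--an-2daws-2dproject-57f05e85-0",
        "namespace": "renku",
    },
}

nominated = is_karpenter_nomination(event)
```

A check like this could only serve as a hint that capacity is on its way, not as a guarantee that the pod will actually be scheduled.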
When I resume a session and there are not enough resources available I immediately get an error in the UI.
That makes sense in our current infra: if the resources aren't there, they won't be (at least for a while). On platforms where autoscaling is enabled (like the AWS PoC I'm carrying out), that's not necessarily true, as some resources might come up shortly.
The error itself does not prevent the session from starting in the background, but it might be misleading for a user.
Can we control this time interval before showing an error?
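To make the question concrete, here is a minimal sketch of what a configurable grace period could look like. Everything in it is an assumption for illustration: the function name `should_show_error`, its parameters, and the example durations are not taken from the existing code.

```python
from datetime import datetime, timedelta, timezone


def should_show_error(pending_since: datetime, now: datetime, grace: timedelta) -> bool:
    """Return True once the session has been unschedulable for at least `grace`."""
    return now - pending_since >= grace


# Hypothetical settings: a zero grace period reproduces the current immediate
# error, while a longer one leaves time for an autoscaler to add a node.
immediate = timedelta(0)
autoscaling = timedelta(minutes=5)

pending_since = datetime(2024, 4, 19, 8, 5, 16, tzinfo=timezone.utc)
after_30s = pending_since + timedelta(seconds=30)

show_now = should_show_error(pending_since, after_30s, immediate)
show_with_grace = should_show_error(pending_since, after_30s, autoscaling)
```

With a setting like this, deployments without autoscaling could keep the current behaviour and autoscaled ones could pick a longer interval.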