jacobtomlinson opened 2 years ago
If you try creating the example but set nodePort to something impossible, the controller gets stuck in a loop.
https://kubernetes.dask.org/en/latest/operator_resources.html
https://kubernetes.io/docs/concepts/services-networking/service/#type-nodeport
We should add some validation here:
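A minimal sketch of the kind of range check the controller could run before creating the scheduler Service (the function name is hypothetical; 30000–32767 is the default Kubernetes NodePort range, which clusters can change via the API server's --service-node-port-range flag, so ideally this would be configurable):

```python
# Default Kubernetes NodePort range; clusters may override it with
# --service-node-port-range on the API server, so treat this as a default.
NODE_PORT_RANGE = range(30000, 32768)


def validate_node_ports(service_spec: dict) -> list[str]:
    """Return human-readable errors for any out-of-range nodePort values."""
    errors = []
    # Only NodePort services carry meaningful nodePort values.
    if service_spec.get("type") != "NodePort":
        return errors
    for port in service_spec.get("ports", []):
        node_port = port.get("nodePort")
        if node_port is not None and node_port not in NODE_PORT_RANGE:
            errors.append(
                f"nodePort {node_port} for port {port.get('name', '?')} is outside "
                f"the allowed range {NODE_PORT_RANGE.start}-{NODE_PORT_RANGE.stop - 1}"
            )
    return errors
```

Running this against the cluster spec before creating any resources would let the controller reject the DaskCluster once with a clear message instead of retrying forever.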
I've tried creating a DaskCluster object with an out-of-range nodePort and I see this:
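For reference, a manifest along these lines should reproduce it (names and values are illustrative; the nodePort of 99999 is deliberately outside the default 30000–32767 range):

```yaml
apiVersion: kubernetes.dask.org/v1
kind: DaskCluster
metadata:
  name: simple
spec:
  scheduler:
    spec:
      containers:
        - name: scheduler
          image: "ghcr.io/dask/dask:latest"
          args: ["dask-scheduler"]
          ports:
            - name: tcp-comm
              containerPort: 8786
              protocol: TCP
    service:
      type: NodePort
      selector:
        dask.org/cluster-name: simple
        dask.org/component: scheduler
      ports:
        - name: tcp-comm
          protocol: TCP
          port: 8786
          targetPort: "tcp-comm"
          nodePort: 99999  # out of range, triggers the retry loop
```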
HTTP response headers: <CIMultiDictProxy('Audit-Id': '4c8966e6-156a-4584-8640-39a7164e9949', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Kubernetes-Pf-Flowschema-Uid': '9f5dce77-f6df-4593-8079-a7ece243e04d', 'X-Kubernetes-Pf-Prioritylevel-Uid': 'f5d11d69-50fc-4b73-8ba4-6e255c69420e', 'Date': 'Wed, 03 May 2023 05:32:24 GMT', 'Content-Length': '210')>
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"pods \"simple-scheduler\" already exists","reason":"AlreadyExists","details":{"name":"simple-scheduler","kind":"pods"},"code":409}
Looks like you have a dask scheduler pod with a conflicting name hanging around. Perhaps left over from a failed test run?
If a DaskCluster is configured with a NodePort service but the ports are out of range, the DaskCluster will be created but the controller logs will error repeatedly. We should do a little more input checking to ensure this can't happen. We should also have the controller put the DaskCluster into some kind of failure status while this is going on.
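One way to surface that failure status could be a status patch applied from the controller's create handler (the controller uses kopf, which exposes a `patch` kwarg whose `status` dict it applies for you). A sketch, with the "Failed" phase value, condition fields, and helper name all assumptions rather than the operator's actual status schema:

```python
def failure_status(reason: str, message: str) -> dict:
    """Build a status patch marking the DaskCluster as failed.

    In a kopf handler this could be applied with
    ``patch.status.update(failure_status(...)["status"])`` so users see the
    problem on the resource itself instead of only in controller logs.
    """
    return {
        "status": {
            # Hypothetical phase/condition values, not the real CRD schema.
            "phase": "Failed",
            "conditions": [
                {
                    "type": "Created",
                    "status": "False",
                    "reason": reason,
                    "message": message,
                }
            ],
        }
    }
```

Combined with upfront validation, the handler could set this status and stop retrying, rather than looping on a request that can never succeed.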