Open Cian911 opened 2 days ago
Hey @Cian911, I'm not able to reproduce this locally so far with those values. From experience you get this error if coreRequest or coreLimit don't conform to the Kubernetes resource syntax. Do you have any mutating webhooks on the cluster that might mutate the request or limit fields on pod creation?
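As a quick local sanity check for the point about resource syntax, here is a minimal sketch that tests whether a string has the shape of a Kubernetes resource quantity. The regular expression is my own approximation of the documented quantity grammar (a signed number plus an optional binary SI, decimal SI, or exponent suffix), not the parser the API server uses, and `looks_like_quantity` and the sample values are hypothetical names and inputs for illustration only.

```python
import re

# Rough approximation of the documented Kubernetes quantity grammar
# (signed number plus optional suffix); NOT the API server's own parser.
_QUANTITY = re.compile(
    r"[+-]?(?:\d+(?:\.\d*)?|\.\d+)"   # signed decimal number
    r"(?:Ki|Mi|Gi|Ti|Pi|Ei"            # binary SI suffixes
    r"|[numkMGTPE]"                    # decimal SI suffixes
    r"|[eE][+-]?\d+)?"                 # decimal exponent
)


def looks_like_quantity(value: str) -> bool:
    """Return True if value has the shape of a Kubernetes resource quantity."""
    return _QUANTITY.fullmatch(value) is not None


if __name__ == "__main__":
    # Hypothetical sample values, not taken from the SparkApplication in this issue.
    for sample in ["1", "500m", "1200m", "2Gi", "1 core", "4g", ""]:
        print(repr(sample), looks_like_quantity(sample))
```

Running the coreRequest and coreLimit values from the CR through a check like this (before and after any webhook mutation) may help narrow down whether the value itself is malformed or something is rewriting it on pod creation.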
Description
I've been scratching my head on this one for the past few days - without any resolution.
I am in the process of testing migrating the spark operator from spark-operator-chart-1.4.6 to v2.0.1 and have come across the following issues. It seems that submission fails at the point it tries to create a driver pod - with the following error around resource quantities:
Below is the full error log.
First thing to note on this log line:
ERROR Client: Please check \"kubectl auth can-i create pod\" first. It should be yes.
- the CR is using a serviceAccount that does have the appropriate permissions to perform full CRUD operations on the pods resource - just to rule that out before anyone asks.
I made no changes to the resource values between spark-operator-chart-1.4.6 and v2.0.1. My driver & executor resource asks essentially look like this:
After enabling debug logs on the operator-controller, I can see that these values are correctly passed in and submitted as --conf arguments, but it fails directly after that.
This smells to me like an issue with spark:3.5.1, but I am not entirely sure. I will post the full SparkApplication below for reference.
Reproduction Code [Required]
Expected behavior
Driver & Executor pods should spin up and job should start.
Actual behavior
Job submission fails.
Terminal Output Screenshot(s)
Environment & Versions
Spark Operator App version: v2.0.1
Helm Chart Version: v2.0.1
Kubernetes Version: v1.29.3
Apache Spark version: 3.4.1
Additional context
cc: @ChenYi015 @jacobsalway