GoogleCloudPlatform / k8s-config-connector

GCP Config Connector, a Kubernetes add-on for managing GCP resources
https://cloud.google.com/config-connector/docs/overview
Apache License 2.0

Cloud Scheduler job 400 error when manual change is made #695

Open erik-carlson opened 2 years ago

erik-carlson commented 2 years ago

Bug Description

If a Cloud Scheduler job is created with Config Connector without specifying a time zone (so that it uses the default), and the schedule is then updated to " *" through the console (which requires choosing a time zone), Config Connector fails with "Error 400: Schedule or time zone is invalid." when it tries to update the job back to the configuration in the Config Connector resource.

This doesn't seem to happen for all updated schedules, but I believe it is consistently reproducible with " *".

Additional Diagnostic Information

na

Kubernetes Cluster Version

v1.22.10-gke.600

Config Connector Version

1.90.0

Config Connector Mode

namespaced mode (default)

Log Output

Update call failed: error applying desired state: googleapi: Error 400: Schedule or time zone is invalid.

Steps to reproduce the issue

see description

YAML snippets

# Originally created job config
apiVersion: cloudscheduler.cnrm.cloud.google.com/v1beta1
kind: CloudSchedulerJob
metadata:
  name: "test"
spec:
  location: "us-central1"
  pubsubTarget:
    attributes:
      cursorDelay: "720"
    topicRef:
      name: "observability"
  schedule: "* * 1 1 *"

maqiuyujoyce commented 2 years ago

Hi @erik-carlson, thank you very much for reporting the issue, and sorry for the delayed response.

I was able to reproduce the issue, but also found it may not be consistently reproducible, so it'll need some further digging for us to understand the root cause and come up with a solution.

Meanwhile, have you figured out a workaround for your use case? Based on the API's behavior, I think explicitly specifying the timeZone may help work around the issue. Please let us know if it's still causing errors.

erik-carlson commented 2 years ago

I think that explicitly setting the time zone would prevent this issue and that's the approach that we're now taking. In the case above, the failing KCC resource had the time zone set to "UTC", managed by the KCC controller. To get it back to upToDate I had to:

  1. Specify the time zone in the KCC resource and set it to a value that is not "UTC". Setting it to "UTC" left the resource in a failed state, so I actually set the time zone to "utc".
  2. Explicitly set the time zone in the KCC resource to "UTC".

At that point I could actually remove that field from the KCC resource and it stayed upToDate.
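
For reference, the workaround discussed in this thread (setting the time zone explicitly instead of relying on the default) might look roughly like the sketch below. It reuses the job config from the original report and adds only the `timeZone` field mentioned above. The value "UTC" is an assumption chosen for illustration, and this has not been verified to prevent the 400 error in every case.

```yaml
# Sketch: the originally reported job config with an explicit time zone.
apiVersion: cloudscheduler.cnrm.cloud.google.com/v1beta1
kind: CloudSchedulerJob
metadata:
  name: "test"
spec:
  location: "us-central1"
  pubsubTarget:
    attributes:
      cursorDelay: "720"
    topicRef:
      name: "observability"
  schedule: "* * 1 1 *"
  # Assumption: "UTC" is an example value; use the time zone the job should run in.
  timeZone: "UTC"
```

With the field set explicitly, a schedule change made through the console should no longer leave the time zone managed only by the default, which is the situation described in the bug report.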