Closed divyaaswath closed 3 years ago
We have created an issue in Pivotal Tracker to manage this:
https://www.pivotaltracker.com/story/show/175611515
The labels on this github issue will be updated when the story is started.
This is being discussed on slack: https://cloudfoundry.slack.com/archives/C1BQKKNP4/p1604340015095600
Yes @manno, I am aware of that. I have provided the logs and issue details in Slack as well but have not received a solution yet. If there is a workaround we could use, that would help too. Please let me know.
I can't reproduce the issue here. I am running a k3s cluster with multiple KubeCF deployments just fine. I had to work around some issues present in KubeCF, like https://github.com/cloudfoundry-incubator/kubecf/issues/1582 , but here is my cluster state:
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system metrics-server-7b4f8b595-hh8qg 1/1 Running 0 20h
nginx-ingress svclb-nginx-ingress-ingress-nginx-controller-hm5ss 2/2 Running 0 20h
nginx-ingress svclb-nginx-ingress-ingress-nginx-controller-zq5x4 2/2 Running 0 20h
kube-system local-path-provisioner-7ff9579c6-fmbzg 1/1 Running 2 20h
nginx-ingress nginx-ingress-ingress-nginx-controller-77d97c57b6-qhn8n 1/1 Running 0 20h
kube-system coredns-66c464876b-942kq 1/1 Running 0 20h
nginx-ingress svclb-nginx-ingress-ingress-nginx-controller-mr8hg 2/2 Running 2 20h
cf-operator cf-operator-quarks-job-556455b9ff-x85xc 1/1 Running 0 30m
cf-operator cf-operator-quarks-secret-66856b4648-lbglv 1/1 Running 0 30m
cf-operator cf-operator-6db597568b-vcvrz 1/1 Running 0 30m
kubecf bosh-dns-7b59bdd66d-w4488 1/1 Running 0 27m
kubecf bosh-dns-7b59bdd66d-grlv8 1/1 Running 0 27m
kubecf cf-apps-dns-dcb9687ff-f4stn 1/1 Running 0 29m
kubecf database-0 2/2 Running 0 27m
kubecf database-seeder-35e960a317320783-vjthw 0/2 Completed 0 27m
kubecf doppler-0 6/6 Running 0 25m
kubecf nats-0 7/7 Running 0 25m
kubecf diego-api-0 9/9 Running 2 25m
kubecf log-api-0 9/9 Running 0 25m
kubecf auctioneer-0 6/6 Running 1 25m
kubecf singleton-blobstore-0 8/8 Running 0 25m
kubecf uaa-0 9/9 Running 0 25m
kubecf tcp-router-0 7/7 Running 0 25m
kubecf routing-api-0 6/6 Running 0 25m
kubecf log-cache-0 10/10 Running 0 25m
kubecf api-0 17/17 Running 1 25m
kubecf router-0 7/7 Running 1 25m
kubecf cc-worker-0 6/6 Running 0 25m
kubecf credhub-0 8/8 Running 0 25m
kubecf scheduler-0 13/13 Running 1 25m
kubecf diego-cell-0 12/12 Running 2 25m
foo bosh-dns-7b59bdd66d-slzgr 1/1 Running 0 17m
foo cf-apps-dns-6497db99c5-2w5j2 1/1 Running 0 19m
foo bosh-dns-7b59bdd66d-hrhzw 1/1 Running 0 17m
foo database-0 2/2 Running 0 17m
foo database-seeder-3f3ba967274250e9-q7vj7 0/2 Completed 0 17m
foo doppler-0 6/6 Running 0 15m
foo nats-0 7/7 Running 0 15m
foo diego-api-0 9/9 Running 2 15m
foo log-api-0 9/9 Running 0 15m
foo singleton-blobstore-0 8/8 Running 0 15m
foo auctioneer-0 6/6 Running 1 15m
foo uaa-0 9/9 Running 0 15m
foo routing-api-0 6/6 Running 0 15m
foo tcp-router-0 7/7 Running 0 15m
foo log-cache-0 10/10 Running 0 15m
foo router-0 7/7 Running 2 15m
foo credhub-0 8/8 Running 0 15m
foo api-0 17/17 Running 1 15m
foo cc-worker-0 6/6 Running 0 15m
foo scheduler-0 13/13 Running 1 15m
foo diego-cell-0 12/12 Running 13 15m
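Most pods in the listing above show few restarts, while diego-cell-0 in the foo namespace has restarted 13 times. As a rough sketch, a listing like this can be filtered for pods above a restart threshold. The function name `pods_with_restarts` is hypothetical, and it assumes the column order shown above (NAMESPACE NAME READY STATUS RESTARTS AGE) with a header line first.

```shell
#!/bin/sh
# pods_with_restarts MIN
# Reads `kubectl get pods --all-namespaces` style output on stdin and prints
# "namespace/pod restarts" for every pod whose RESTARTS column is >= MIN.
# Assumes RESTARTS is the fifth whitespace-separated column and that the
# first line is the header.
pods_with_restarts() {
  awk -v min="$1" 'NR > 1 && $5 + 0 >= min + 0 { print $1 "/" $2 " " $5 }'
}
```

It could be used as `kubectl get pods --all-namespaces | pods_with_restarts 5`; the threshold of 5 is arbitrary.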
@divyaaswath could this be related to the RoleBinding setup? Can you show how you are deploying KubeCF in different namespaces?
@mudler I also faced the ClusterRole issue which you reported above, but I just overrode the cluster role annotation for the subsequent environment and proceeded. Also, as per the documentation, for each namespace where we want kubecf deployed we need to create the namespace with specific labels, a service account in that namespace, and a role binding. Here is what gets run for every environment (xxx changes for every env):
cat <<_EOF_ | oc create -f -
apiVersion: v1
kind: Namespace
metadata:
  name: xxx
  labels:
    quarks.cloudfoundry.org/monitored: cfo
    quarks.cloudfoundry.org/qjob-service-account: qjob-persist-output
spec:
  finalizers:
  - kubernetes
_EOF_
cat <<_EOF_ | oc create -f -
apiVersion: v1
kind: ServiceAccount
metadata:
  name: qjob-persist-output
  namespace: xxx
_EOF_
cat <<_EOF_ | oc create -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: qjob-persist-output-xxx
  namespace: xxx
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: qjob-persist-output
subjects:
- kind: ServiceAccount
  name: qjob-persist-output
_EOF_
Let me know if there is an issue with these settings. Thanks! Also, can you push an app to both environments? Let me know.
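Since the same three manifests are created for every environment with only the namespace name changing, they can be generated from one template. This is a minimal sketch, not the documented procedure: the function name `render_quarks_namespace` is hypothetical, the `spec.finalizers` block from the Namespace is left out, and the `namespace` field under `subjects` is an assumption that is not present in the manifests above.

```shell
#!/bin/sh
# render_quarks_namespace NAME
# Prints the Namespace, ServiceAccount and RoleBinding manifests for one
# environment as a single multi-document YAML stream on stdout.
render_quarks_namespace() {
  ns=$1
  cat <<EOF
apiVersion: v1
kind: Namespace
metadata:
  name: ${ns}
  labels:
    quarks.cloudfoundry.org/monitored: cfo
    quarks.cloudfoundry.org/qjob-service-account: qjob-persist-output
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: qjob-persist-output
  namespace: ${ns}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: qjob-persist-output-${ns}
  namespace: ${ns}
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: qjob-persist-output
subjects:
- kind: ServiceAccount
  name: qjob-persist-output
  # Assumption: not in the original manifest; added so the subject is
  # explicitly scoped to the same namespace.
  namespace: ${ns}
EOF
}
```

With placeholder environment names, this could be applied as `for ns in env1 env2; do render_quarks_namespace "$ns" | oc create -f -; done`. Rendering to stdout first makes it easy to diff the output between environments before creating anything.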
I can reproduce the issue with Eirini enabled. Diego is not affected as far as I can tell. Note that in my case it was enough to set up the quarks-operator with more than one namespace and deploy on the first one. The Ruby stack trace points to https://github.com/cloudfoundry/cloud_controller_ng/blob/master/lib/cloud_controller/opi/apps_client.rb#L23 , but I have inspected the cloud-controller-clock container and it seems to have the correct opi configuration endpoints.
I've also tried to contact opi from a different container in the same pod, and DNS was working as intended.
I'm debugging further now and checking what differs from a deployment on a single namespace, but from the quarks-operator perspective it shouldn't matter. So I am starting to suspect that something is not tuned correctly on the KubeCF side.
Here is the full stack trace:
/:/var/vcap/jobs/cloud_controller_clock# /var/vcap/jobs/cloud_controller_clock/bin/cloud_controller_clock
I, [2020-11-18T11:58:36.656928 #939] INFO -- : Starting clock for 17 events: [ app_usage_events.job audit_events.job failed_jobs.job service_usage_events.job completed_tasks.job expired_blob_cleanup.job expired_resource_cleanup.job expired_orphaned_blob_cleanup.job orphaned_blobs_cleanup.job pollable_job_cleanup.job request_counts_cleanup.job prune_completed_deployments.job prune_completed_builds.job prune_excess_app_revisions.job pending_droplets.job pending_builds.job diego_sync.job ]
I, [2020-11-18T11:58:36.657512 #939] INFO -- : Triggering 'pending_droplets.job'
I, [2020-11-18T11:58:36.663295 #939] INFO -- : Triggering 'pending_builds.job'
I, [2020-11-18T11:58:36.668294 #939] INFO -- : Triggering 'diego_sync.job'
#<HTTP::Message:0x000056307b4d66b8>
E, [2020-11-18T11:58:36.882341 #939] ERROR -- : undefined method `[]' for nil:NilClass (NoMethodError)
/var/vcap/packages/cloud_controller_ng/cloud_controller_ng/lib/cloud_controller/opi/apps_client.rb:26:in `fetch_scheduling_infos'
/var/vcap/packages/cloud_controller_ng/cloud_controller_ng/lib/cloud_controller/diego/processes_sync.rb:22:in `sync'
/var/vcap/packages/cloud_controller_ng/cloud_controller_ng/app/jobs/diego/sync.rb:17:in `block in perform'
/var/vcap/packages/cloud_controller_ng/gem_home/ruby/2.5.0/gems/statsd-ruby-1.4.0/lib/statsd.rb:412:in `time'
/var/vcap/packages/cloud_controller_ng/cloud_controller_ng/app/jobs/diego/sync.rb:16:in `perform'
/var/vcap/packages/cloud_controller_ng/cloud_controller_ng/app/jobs/wrapping_job.rb:11:in `perform'
/var/vcap/packages/cloud_controller_ng/cloud_controller_ng/app/jobs/timeout_job.rb:13:in `block in perform'
/var/vcap/packages/ruby-2.5.5-r0.10.0/lib/ruby/2.5.0/timeout.rb:93:in `block in timeout'
/var/vcap/packages/ruby-2.5.5-r0.10.0/lib/ruby/2.5.0/timeout.rb:33:in `block in catch'
/var/vcap/packages/ruby-2.5.5-r0.10.0/lib/ruby/2.5.0/timeout.rb:33:in `catch'
/var/vcap/packages/ruby-2.5.5-r0.10.0/lib/ruby/2.5.0/timeout.rb:33:in `catch'
/var/vcap/packages/ruby-2.5.5-r0.10.0/lib/ruby/2.5.0/timeout.rb:108:in `timeout'
/var/vcap/packages/cloud_controller_ng/cloud_controller_ng/app/jobs/timeout_job.rb:12:in `perform'
/var/vcap/packages/cloud_controller_ng/gem_home/ruby/2.5.0/gems/delayed_job-4.1.8/lib/delayed/backend/base.rb:81:in `block in invoke_job'
/var/vcap/packages/cloud_controller_ng/gem_home/ruby/2.5.0/gems/delayed_job-4.1.8/lib/delayed/lifecycle.rb:61:in `block in initialize'
/var/vcap/packages/cloud_controller_ng/gem_home/ruby/2.5.0/gems/delayed_job-4.1.8/lib/delayed/lifecycle.rb:66:in `execute'
/var/vcap/packages/cloud_controller_ng/gem_home/ruby/2.5.0/gems/delayed_job-4.1.8/lib/delayed/lifecycle.rb:40:in `run_callbacks'
/var/vcap/packages/cloud_controller_ng/gem_home/ruby/2.5.0/gems/delayed_job-4.1.8/lib/delayed/backend/base.rb:78:in `invoke_job'
/var/vcap/packages/cloud_controller_ng/gem_home/ruby/2.5.0/gems/delayed_job-4.1.8/lib/delayed/backend/base.rb:19:in `block (2 levels) in enqueue_job'
/var/vcap/packages/cloud_controller_ng/gem_home/ruby/2.5.0/gems/delayed_job-4.1.8/lib/delayed/lifecycle.rb:61:in `block in initialize'
/var/vcap/packages/cloud_controller_ng/gem_home/ruby/2.5.0/gems/delayed_job-4.1.8/lib/delayed/lifecycle.rb:66:in `execute'
/var/vcap/packages/cloud_controller_ng/gem_home/ruby/2.5.0/gems/delayed_job-4.1.8/lib/delayed/lifecycle.rb:40:in `run_callbacks'
/var/vcap/packages/cloud_controller_ng/gem_home/ruby/2.5.0/gems/delayed_job-4.1.8/lib/delayed/backend/base.rb:17:in `block in enqueue_job'
/var/vcap/packages/cloud_controller_ng/gem_home/ruby/2.5.0/gems/delayed_job-4.1.8/lib/delayed/backend/base.rb:16:in `tap'
/var/vcap/packages/cloud_controller_ng/gem_home/ruby/2.5.0/gems/delayed_job-4.1.8/lib/delayed/backend/base.rb:16:in `enqueue_job'
/var/vcap/packages/cloud_controller_ng/gem_home/ruby/2.5.0/gems/delayed_job-4.1.8/lib/delayed/backend/base.rb:12:in `enqueue'
/var/vcap/packages/cloud_controller_ng/cloud_controller_ng/app/jobs/enqueuer.rb:31:in `block in run_inline'
/var/vcap/packages/cloud_controller_ng/cloud_controller_ng/app/jobs/enqueuer.rb:56:in `run_immediately'
/var/vcap/packages/cloud_controller_ng/cloud_controller_ng/app/jobs/enqueuer.rb:30:in `run_inline'
/var/vcap/packages/cloud_controller_ng/cloud_controller_ng/lib/cloud_controller/clock/clock.rb:51:in `block in schedule_frequent_inline_job'
/var/vcap/packages/cloud_controller_ng/cloud_controller_ng/lib/cloud_controller/clock/clock.rb:58:in `block in schedule_job'
/var/vcap/packages/cloud_controller_ng/cloud_controller_ng/lib/cloud_controller/clock/distributed_scheduler.rb:12:in `block (2 levels) in schedule_periodic_job'
/var/vcap/packages/cloud_controller_ng/cloud_controller_ng/lib/cloud_controller/clock/distributed_executor.rb:30:in `execute_job'
/var/vcap/packages/cloud_controller_ng/cloud_controller_ng/lib/cloud_controller/clock/distributed_scheduler.rb:12:in `block in schedule_periodic_job'
/var/vcap/packages/cloud_controller_ng/gem_home/ruby/2.5.0/gems/clockwork-2.0.4/lib/clockwork/event.rb:58:in `execute'
/var/vcap/packages/cloud_controller_ng/gem_home/ruby/2.5.0/gems/clockwork-2.0.4/lib/clockwork/event.rb:41:in `block in run'
#<Thread:0x00007f029001dbc0@/var/vcap/packages/cloud_controller_ng/gem_home/ruby/2.5.0/gems/clockwork-2.0.4/lib/clockwork/event.rb:40 run> terminated with exception (report_on_exception is true):
Traceback (most recent call last):
35: from /var/vcap/packages/cloud_controller_ng/gem_home/ruby/2.5.0/gems/clockwork-2.0.4/lib/clockwork/event.rb:41:in `block in run'
34: from /var/vcap/packages/cloud_controller_ng/gem_home/ruby/2.5.0/gems/clockwork-2.0.4/lib/clockwork/event.rb:58:in `execute'
33: from /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/lib/cloud_controller/clock/distributed_scheduler.rb:12:in `block in schedule_periodic_job'
32: from /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/lib/cloud_controller/clock/distributed_executor.rb:30:in `execute_job'
31: from /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/lib/cloud_controller/clock/distributed_scheduler.rb:12:in `block (2 levels) in schedule_periodic_job'
30: from /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/lib/cloud_controller/clock/clock.rb:58:in `block in schedule_job'
29: from /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/lib/cloud_controller/clock/clock.rb:51:in `block in schedule_frequent_inline_job'
28: from /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/app/jobs/enqueuer.rb:30:in `run_inline'
27: from /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/app/jobs/enqueuer.rb:56:in `run_immediately'
26: from /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/app/jobs/enqueuer.rb:31:in `block in run_inline'
25: from /var/vcap/packages/cloud_controller_ng/gem_home/ruby/2.5.0/gems/delayed_job-4.1.8/lib/delayed/backend/base.rb:12:in `enqueue'
24: from /var/vcap/packages/cloud_controller_ng/gem_home/ruby/2.5.0/gems/delayed_job-4.1.8/lib/delayed/backend/base.rb:16:in `enqueue_job'
23: from /var/vcap/packages/cloud_controller_ng/gem_home/ruby/2.5.0/gems/delayed_job-4.1.8/lib/delayed/backend/base.rb:16:in `tap'
22: from /var/vcap/packages/cloud_controller_ng/gem_home/ruby/2.5.0/gems/delayed_job-4.1.8/lib/delayed/backend/base.rb:17:in `block in enqueue_job'
21: from /var/vcap/packages/cloud_controller_ng/gem_home/ruby/2.5.0/gems/delayed_job-4.1.8/lib/delayed/lifecycle.rb:40:in `run_callbacks'
20: from /var/vcap/packages/cloud_controller_ng/gem_home/ruby/2.5.0/gems/delayed_job-4.1.8/lib/delayed/lifecycle.rb:66:in `execute'
19: from /var/vcap/packages/cloud_controller_ng/gem_home/ruby/2.5.0/gems/delayed_job-4.1.8/lib/delayed/lifecycle.rb:61:in `block in initialize'
18: from /var/vcap/packages/cloud_controller_ng/gem_home/ruby/2.5.0/gems/delayed_job-4.1.8/lib/delayed/backend/base.rb:19:in `block (2 levels) in enqueue_job'
17: from /var/vcap/packages/cloud_controller_ng/gem_home/ruby/2.5.0/gems/delayed_job-4.1.8/lib/delayed/backend/base.rb:78:in `invoke_job'
16: from /var/vcap/packages/cloud_controller_ng/gem_home/ruby/2.5.0/gems/delayed_job-4.1.8/lib/delayed/lifecycle.rb:40:in `run_callbacks'
15: from /var/vcap/packages/cloud_controller_ng/gem_home/ruby/2.5.0/gems/delayed_job-4.1.8/lib/delayed/lifecycle.rb:66:in `execute'
14: from /var/vcap/packages/cloud_controller_ng/gem_home/ruby/2.5.0/gems/delayed_job-4.1.8/lib/delayed/lifecycle.rb:61:in `block in initialize'
13: from /var/vcap/packages/cloud_controller_ng/gem_home/ruby/2.5.0/gems/delayed_job-4.1.8/lib/delayed/backend/base.rb:81:in `block in invoke_job'
12: from /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/app/jobs/timeout_job.rb:12:in `perform'
11: from /var/vcap/packages/ruby-2.5.5-r0.10.0/lib/ruby/2.5.0/timeout.rb:108:in `timeout'
10: from /var/vcap/packages/ruby-2.5.5-r0.10.0/lib/ruby/2.5.0/timeout.rb:33:in `catch'
9: from /var/vcap/packages/ruby-2.5.5-r0.10.0/lib/ruby/2.5.0/timeout.rb:33:in `catch'
8: from /var/vcap/packages/ruby-2.5.5-r0.10.0/lib/ruby/2.5.0/timeout.rb:33:in `block in catch'
7: from /var/vcap/packages/ruby-2.5.5-r0.10.0/lib/ruby/2.5.0/timeout.rb:93:in `block in timeout'
6: from /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/app/jobs/timeout_job.rb:13:in `block in perform'
5: from /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/app/jobs/wrapping_job.rb:11:in `perform'
4: from /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/app/jobs/diego/sync.rb:16:in `perform'
3: from /var/vcap/packages/cloud_controller_ng/gem_home/ruby/2.5.0/gems/statsd-ruby-1.4.0/lib/statsd.rb:412:in `time'
2: from /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/app/jobs/diego/sync.rb:17:in `block in perform'
1: from /var/vcap/packages/cloud_controller_ng/cloud_controller_ng/lib/cloud_controller/diego/processes_sync.rb:22:in `sync'
/var/vcap/packages/cloud_controller_ng/cloud_controller_ng/lib/cloud_controller/opi/apps_client.rb:26:in `fetch_scheduling_infos': undefined method `[]' for nil:NilClass (NoMethodError)
rake aborted!
NoMethodError: undefined method `[]' for nil:NilClass
/var/vcap/packages/cloud_controller_ng/cloud_controller_ng/lib/cloud_controller/opi/apps_client.rb:26:in `fetch_scheduling_infos'
/var/vcap/packages/cloud_controller_ng/cloud_controller_ng/lib/cloud_controller/diego/processes_sync.rb:22:in `sync'
/var/vcap/packages/cloud_controller_ng/cloud_controller_ng/app/jobs/diego/sync.rb:17:in `block in perform'
/var/vcap/packages/cloud_controller_ng/gem_home/ruby/2.5.0/gems/statsd-ruby-1.4.0/lib/statsd.rb:412:in `time'
/var/vcap/packages/cloud_controller_ng/cloud_controller_ng/app/jobs/diego/sync.rb:16:in `perform'
/var/vcap/packages/cloud_controller_ng/cloud_controller_ng/app/jobs/wrapping_job.rb:11:in `perform'
/var/vcap/packages/cloud_controller_ng/cloud_controller_ng/app/jobs/timeout_job.rb:13:in `block in perform'
/var/vcap/packages/cloud_controller_ng/cloud_controller_ng/app/jobs/timeout_job.rb:12:in `perform'
/var/vcap/packages/cloud_controller_ng/gem_home/ruby/2.5.0/gems/delayed_job-4.1.8/lib/delayed/backend/base.rb:81:in `block in invoke_job'
/var/vcap/packages/cloud_controller_ng/gem_home/ruby/2.5.0/gems/delayed_job-4.1.8/lib/delayed/lifecycle.rb:61:in `block in initialize'
/var/vcap/packages/cloud_controller_ng/gem_home/ruby/2.5.0/gems/delayed_job-4.1.8/lib/delayed/lifecycle.rb:66:in `execute'
/var/vcap/packages/cloud_controller_ng/gem_home/ruby/2.5.0/gems/delayed_job-4.1.8/lib/delayed/lifecycle.rb:40:in `run_callbacks'
/var/vcap/packages/cloud_controller_ng/gem_home/ruby/2.5.0/gems/delayed_job-4.1.8/lib/delayed/backend/base.rb:78:in `invoke_job'
/var/vcap/packages/cloud_controller_ng/cloud_controller_ng/app/jobs/enqueuer.rb:56:in `run_immediately'
/var/vcap/packages/cloud_controller_ng/cloud_controller_ng/app/jobs/enqueuer.rb:30:in `run_inline'
/var/vcap/packages/cloud_controller_ng/cloud_controller_ng/lib/cloud_controller/clock/clock.rb:51:in `block in schedule_frequent_inline_job'
/var/vcap/packages/cloud_controller_ng/cloud_controller_ng/lib/cloud_controller/clock/clock.rb:58:in `block in schedule_job'
/var/vcap/packages/cloud_controller_ng/cloud_controller_ng/lib/cloud_controller/clock/distributed_scheduler.rb:12:in `block (2 levels) in schedule_periodic_job'
/var/vcap/packages/cloud_controller_ng/cloud_controller_ng/lib/cloud_controller/clock/distributed_executor.rb:30:in `execute_job'
/var/vcap/packages/cloud_controller_ng/cloud_controller_ng/lib/cloud_controller/clock/distributed_scheduler.rb:12:in `block in schedule_periodic_job'
/var/vcap/packages/cloud_controller_ng/gem_home/ruby/2.5.0/gems/clockwork-2.0.4/lib/clockwork/event.rb:58:in `execute'
/var/vcap/packages/cloud_controller_ng/gem_home/ruby/2.5.0/gems/clockwork-2.0.4/lib/clockwork/event.rb:41:in `block in run'
After debugging with @manno, we found out that the cc-worker receives an internal server error from opi. When this happens, the Eirini pod logs show:
W1118 12:45:49.023665 1 client_config.go:552] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
{"timestamp":"2020-11-18T12:45:49.026286043Z","level":"info","source":"handler","message":"handler.opi-connected","data":{}}
{"timestamp":"2020-11-18T12:47:27.541348858Z","level":"debug","source":"handler","message":"handler.list-apps.requested","data":{"session":"2"}}
{"timestamp":"2020-11-18T12:47:27.690538039Z","level":"error","source":"desirer","message":"desirer.list.failed-to-list-statefulsets","data":{"error":"statefulsets.apps is forbidden: User \"system:serviceaccount:kubecf:opi\" cannot list resource \"statefulsets\" in API group \"apps\" at the cluster scope","session":"1"}}
{"timestamp":"2020-11-18T12:47:27.690665343Z","level":"error","source":"handler","message":"handler.list-apps.bifrost-failed","data":{"error":"failed to list desired LRPs: failed to list statefulsets: statefulsets.apps is forbidden: User \"system:serviceaccount:kubecf:opi\" cannot list resource \"statefulsets\" in API group \"apps\" at the cluster scope","session":"2"}}
It looks like the required cluster role configuration is in the kubecf namespace instead of the eirini one, see: https://github.com/cloudfoundry-incubator/kubecf/blob/master/mixins/eirini/templates/eirini-cluster-role.yaml#L66 .
@divyaaswath I've opened https://github.com/cloudfoundry-incubator/kubecf/issues/1602 to track the bug in KubeCF, and will close this, as this sounds like a configuration issue rather than a Quarks bug.
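The "forbidden" error in the Eirini log above names the service account, verb, resource and API group that were denied, which is enough to re-check the permission by hand. As a sketch under stated assumptions, the helper below (the name `can_i_from_error` is hypothetical) turns such a log line into a `kubectl auth can-i` command using the `--as` impersonation flag and `--all-namespaces`. It only prints the command and does not contact a cluster, and it only handles the "at the cluster scope" wording shown above.

```shell
#!/bin/sh
# can_i_from_error LINE
# Extracts user, verb, resource and API group from a log line of the form
#   User \"<user>\" cannot <verb> resource \"<resource>\" in API group \"<group>\" at the cluster scope
# and prints the matching `kubectl auth can-i` check.
# Limitation: for the core API group (empty group name) the printed
# resource ends in a trailing dot and needs adjusting by hand.
can_i_from_error() {
  printf '%s\n' "$1" |
    tr -d '\\' |   # drop the JSON escaping backslashes before matching
    sed -n 's/.*User "\([^"]*\)" cannot \([a-z]*\) resource "\([^"]*\)" in API group "\([^"]*\)" at the cluster scope.*/kubectl auth can-i \2 \3.\4 --all-namespaces --as=\1/p'
}
```

Running the printed command before and after changing the cluster role binding gives a quick yes/no answer on whether the service account can now list the resource. On OpenShift, `oc auth can-i` should behave the same way, though that is not verified here.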
Thanks @mudler !!
Describe the bug
Followed the documentation to set up multiple namespaces for kubecf v2.6.1 using a single operator. Results are not as expected. All environments except the last one fail, with the scheduler-0 pod getting into CrashLoopBackOff status. The pod's first container, cloud-controller-clock, fails with the following error:
To Reproduce
Set up a single operator to support multiple namespaces as documented at https://quarks.suse.dev/docs/quarks-operator/install/
Expected behavior
All environments should be up and running for kubecf v2.6.1 and managed by a single cf-operator
Environment
Additional context
Installation is done on OpenShift version 4.4