kcp-dev / contrib-tmc

An experimental add-on re-adding some Kubernetes compute APIs and implementing transparent multi-cluster scheduling
Apache License 2.0

bug: jobs.batch does not get synced #100

Open pdettori opened 2 years ago

pdettori commented 2 years ago

Describe the bug

I am trying to use the syncer to sync jobs.batch resources, but I get this message in the syncer logs:

I0912 13:03:00.523032       1 syncer.go:171] Attempting to retrieve GVRs from upstream...
E0912 13:03:00.528186       1 syncer.go:179] Failed to retrieve GVRs from kcp: the following resource types were requested to be synced, but were not found in the KCP logical cluster: [jobs.batch]

Steps To Reproduce

  1. Build kcp v0.8.2
  2. Start kcp, create a new org and workspace, enter the workspace
  3. Generate syncer deployment yaml with command:
    kubectl kcp workload sync chost1 \
    --replicas=1 --resources=deployments.apps,services,jobs.batch \
    --syncer-image=ghcr.io/kcp-dev/kcp/syncer:v0.8.2 --output-file ./syncer-deployment.yaml
  4. Deploy syncer on pcluster and verify it starts
  5. Get log for syncer on pcluster, e.g.
    kubectl logs -n kcp-syncer-chost1-1rbrc02q kcp-syncer-chost1-1rbrc02q-548c6ffdf8-jdhqs 
  6. Check error above is present.

Expected Behaviour

I expect the jobs.batch API to be imported so that jobs.batch resources can be synced to the pcluster.

Additional Context

Testing with kcp v0.8.2

stevekuznetsov commented 2 years ago

cc @davidfestal

davidfestal commented 2 years ago

Is there anything meaningful in the kcp logs or at the beginning of the syncer logs?

pdettori commented 2 years ago

@davidfestal not really, this is what I see in the log:

k logs -n kcp-syncer-chost1-1wxg4cnn kcp-syncer-chost1-1wxg4cnn-f687cc8c5-pqdvt 
I0912 18:53:27.662351       1 syncer.go:78] Syncing the following resource types: [configmaps deployments.apps jobs.batch secrets serviceaccounts services]
I0912 18:53:27.663098       1 syncer.go:74] Starting syncer for SyncTarget: root:myorg:tenant1|chost1
I0912 18:53:27.663243       1 syncer.go:91] Attempting to retrieve the Syncer virtual workspace URL from SyncTarget root:myorg:tenant1|chost1
I0912 18:53:27.740222       1 syncer.go:171] Attempting to retrieve GVRs from upstream...
E0912 18:53:27.746028       1 syncer.go:179] Failed to retrieve GVRs from kcp: the following resource types were requested to be synced, but were not found in the KCP logical cluster: [deployments.apps jobs.batch services]
I0912 18:53:27.840913       1 apiimporter.go:139] Starting API Importer for location chost1 in cluster root:myorg:tenant1
I0912 18:53:27.840971       1 apiimporter.go:170] Importing APIs from location chost1 in logical cluster root:myorg:tenant1 (resources=[configmaps deployments.apps jobs.batch secrets serviceaccounts services])
I0912 18:53:28.747047       1 syncer.go:171] Attempting to retrieve GVRs from upstream...
E0912 18:53:28.752436       1 syncer.go:179] Failed to retrieve GVRs from kcp: the following resource types were requested to be synced, but were not found in the KCP logical cluster: [deployments.apps jobs.batch services]
I0912 18:53:29.475192       1 request.go:628] Waited for 1.032306833s due to client-side throttling, not priority and fairness, request: GET:https://10.43.0.1:443/apis/autoscaling/v2beta2?timeout=32s
I0912 18:53:29.746730       1 syncer.go:171] Attempting to retrieve GVRs from upstream...
E0912 18:53:29.751742       1 syncer.go:179] Failed to retrieve GVRs from kcp: the following resource types were requested to be synced, but were not found in the KCP logical cluster: [deployments.apps jobs.batch services]
I0912 18:53:30.675134       1 request.go:628] Waited for 1.032576115s due to client-side throttling, not priority and fairness, request: GET:https://10.43.0.1:443/apis/node.k8s.io/v1?timeout=32s
I0912 18:53:30.747012       1 syncer.go:171] Attempting to retrieve GVRs from upstream...
E0912 18:53:30.753403       1 syncer.go:179] Failed to retrieve GVRs from kcp: the following resource types were requested to be synced, but were not found in the KCP logical cluster: [deployments.apps jobs.batch services]
I0912 18:53:30.776110       1 discovery.go:164] ignoring a resource since it is part of the core KCP resources: secrets (/v1, Kind=Secret)
I0912 18:53:30.776130       1 discovery.go:164] ignoring a resource since it is part of the core KCP resources: serviceaccounts (/v1, Kind=ServiceAccount)
I0912 18:53:30.776139       1 discovery.go:164] ignoring a resource since it is part of the core KCP resources: configmaps (/v1, Kind=ConfigMap)
I0912 18:53:30.776146       1 discovery.go:182] processing discovery for resource services (services.core)
I0912 18:53:30.777672       1 discovery.go:182] processing discovery for resource deployments (deployments.apps)
I0912 18:53:30.780099       1 discovery.go:182] processing discovery for resource jobs (jobs.batch)
I0912 18:53:30.782705       1 apiimporter.go:281] Creating APIResourceImport root:myorg:tenant1|services.chost1.v1.core
I0912 18:53:30.793072       1 apiimporter.go:281] Creating APIResourceImport root:myorg:tenant1|deployments.chost1.v1.apps
I0912 18:53:30.833231       1 apiimporter.go:281] Creating APIResourceImport root:myorg:tenant1|jobs.chost1.v1.batch
I0912 18:53:31.746638       1 syncer.go:171] Attempting to retrieve GVRs from upstream...
E0912 18:53:31.751838       1 syncer.go:179] Failed to retrieve GVRs from kcp: the following resource types were requested to be synced, but were not found in the KCP logical cluster: [jobs.batch]
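
For anyone comparing runs, here is a minimal sketch (plain Python, not part of kcp or the syncer) that extracts the list of missing resource types from each "not found" error, so it is easy to see how the list shrinks once the APIResourceImports are created. The regular expression and the `missing_resources` helper are assumptions based only on the log lines pasted above.

```python
import re

# Matches the syncer "not found" error as it appears in the log above.
# The pattern is an assumption derived from those lines, not from kcp source.
MISSING_RE = re.compile(
    r"^E\d{4} (?P<time>\d{2}:\d{2}:\d{2}\.\d+).*"
    r"were not found in the KCP logical cluster: \[(?P<resources>[^\]]*)\]"
)


def missing_resources(log_text):
    """Return (timestamp, [resource, ...]) for every 'not found' error line."""
    results = []
    for line in log_text.splitlines():
        match = MISSING_RE.search(line.strip())
        if match:
            results.append((match.group("time"), match.group("resources").split()))
    return results


# Three lines copied from the log above.
sample = """\
E0912 18:53:30.753403       1 syncer.go:179] Failed to retrieve GVRs from kcp: the following resource types were requested to be synced, but were not found in the KCP logical cluster: [deployments.apps jobs.batch services]
I0912 18:53:30.833231       1 apiimporter.go:281] Creating APIResourceImport root:myorg:tenant1|jobs.chost1.v1.batch
E0912 18:53:31.751838       1 syncer.go:179] Failed to retrieve GVRs from kcp: the following resource types were requested to be synced, but were not found in the KCP logical cluster: [jobs.batch]
"""

for timestamp, resources in missing_resources(sample):
    print(timestamp, resources)
```

On the sample, deployments.apps and services drop out of the list after the imports are created, while jobs.batch remains.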

pdettori commented 2 years ago

On the kcp side all api resources imports seem to be created fine:

k get apiresourceimports.apiresource.kcp.dev 
NAME
deployments.chost1.v1.apps
jobs.chost1.v1.batch
services.chost1.v1.core

davidfestal commented 2 years ago

Do you think you could provide the result of the following commands:

k get apiexport kubernetes -o yaml
k get apiresourceschemas

?

pdettori commented 2 years ago

@davidfestal sure, this is what I get:

$ k get apiexport kubernetes -o yaml
apiVersion: apis.kcp.dev/v1alpha1
kind: APIExport
metadata:
  annotations:
    kcp.dev/cluster: root:myorg:tenant1
    workload.kcp.dev/skip-default-object-creation: "true"
  creationTimestamp: "2022-09-13T15:21:28Z"
  generation: 4
  name: kubernetes
  resourceVersion: "551"
  uid: 16dc39d2-eada-48fe-9f6f-6ab1ecd3aa15
spec:
  identity:
    secretRef:
      name: kubernetes
      namespace: kcp-system
  latestResourceSchemas:
  - rev-540.deployments.apps
  - rev-536.services.core
status:
  conditions:
  - lastTransitionTime: "2022-09-13T15:21:28Z"
    status: "True"
    type: IdentityValid
  - lastTransitionTime: "2022-09-13T15:21:28Z"
    status: "True"
    type: VirtualWorkspaceURLsReady
  identityHash: 6de97fd78fceb2551600cfa5d3d3adbe238f05dd2e629d4b00bfaeb32d6a1fff
  virtualWorkspaces:
  - url: https://9.31.110.127:6445/services/apiexport/root:myorg:tenant1/kubernetes

$ k get apiresourceschemas
NAME                       AGE
rev-536.services.core      10m
rev-540.deployments.apps   10m
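
A quick way to spot the gap in the output above is to compare the resources that were requested against the entries in `latestResourceSchemas`. The sketch below is only an illustration in plain Python: the helper names are hypothetical, the `rev-<n>.` prefix handling is inferred from the names shown above, and `services` is written as `services.core` to match the schema naming.

```python
def resource_of(schema_name):
    """Strip the leading 'rev-<n>.' prefix seen in the schema names above."""
    prefix, _, rest = schema_name.partition(".")
    return rest if prefix.startswith("rev-") else schema_name


def unexported(requested, latest_resource_schemas):
    """Return the requested resources with no matching schema on the APIExport."""
    exported = {resource_of(name) for name in latest_resource_schemas}
    return sorted(set(requested) - exported)


# Values taken from the APIExport output above; 'services.core' is an
# assumed spelling chosen to line up with the schema name.
requested = ["deployments.apps", "services.core", "jobs.batch"]
latest = ["rev-540.deployments.apps", "rev-536.services.core"]

print(unexported(requested, latest))
```

With the values above, only jobs.batch is reported as missing from the export.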

davidfestal commented 2 years ago

We need to find out why no APIResourceSchema was created for jobs from its APIResourceImport and added to the kubernetes APIExport, as happened for deployments and services.

cc @sttts @qiujian16

Next commands to further troubleshoot this:

k get apiresourceimports -o wide
k get negotiatedapiresources -o wide

pdettori commented 2 years ago

@davidfestal here they are:

$ k get apiresourceimports -o wide
NAME                         LOCATION   SCHEMA UPDATE STRATEGY   API VERSION   API RESOURCE   COMPATIBLE   AVAILABLE
deployments.chost1.v1.apps   chost1     UpdateUnpublished        apps/v1       deployments    True         
jobs.chost1.v1.batch         chost1     UpdateUnpublished        batch/v1      jobs           True         
services.chost1.v1.core      chost1     UpdateUnpublished        v1            services       True       
$ k get negotiatedapiresources -o wide
NAME                  PUBLISH   API VERSION   API RESOURCE   PUBLISHED   ENFORCED
deployments.v1.apps             apps/v1       deployments                
jobs.v1.batch                   batch/v1      jobs                       
services.v1.core                v1            services          

roivaz commented 1 year ago

Hi @davidfestal, I have hit this problem too. I have tested with both kcp 0.8.2 and 0.9.1, and in both cases the APIResourceSchema for jobs.batch never gets created. The error I see in the kcp logs is:

I1024 14:17:43.490890  224873 workload_apiexport_reconcile.go:132] "missing or outdated schema on APIExport, adding" reconciler="kcp-workload-apiexport" key="root|kubernetes" apiexport.workspace="root" apiexport.namespace="" apiexport.name="kubernetes" apiexport.apiVersion="" schema="jobs.batch"
E1024 14:17:43.514225  224873 workload_apiexport_controller.go:212] "kcp-workload-apiexport" controller failed to sync "root|kubernetes", err: apiresourceschemas.apis.kcp.dev "rev-542.jobs.batch" is forbidden: [spec.versions[0].schema.openAPIV3Schema.properties[status].properties[conditions].x-kubernetes-list-type: Invalid value: "atomic": must be map if x-kubernetes-list-map-keys is non-empty]

The syncer logs show the same error posted by @pdettori for v0.8.2. For v0.9.1 I see no errors in the syncer logs, which is weird.
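
The rule quoted in the error can be checked offline against a schema dump. The following is a minimal sketch in plain Python, not kcp's or Kubernetes' actual validation code: it walks an OpenAPI v3 schema represented as nested dicts and reports every node that sets `x-kubernetes-list-map-keys` while `x-kubernetes-list-type` is not `map`. The function name and the schema fragment are hypothetical; the fragment is only shaped like the path in the error message and is not copied from the real jobs.batch schema.

```python
def find_list_type_conflicts(schema, path="openAPIV3Schema"):
    """Yield paths of nodes with list-map-keys set but list-type other than 'map'."""
    if not isinstance(schema, dict):
        return
    keys = schema.get("x-kubernetes-list-map-keys")
    if keys and schema.get("x-kubernetes-list-type") != "map":
        yield path
    # Recurse into object properties and array items.
    for name, child in schema.get("properties", {}).items():
        yield from find_list_type_conflicts(child, f"{path}.properties[{name}]")
    if isinstance(schema.get("items"), dict):
        yield from find_list_type_conflicts(schema["items"], f"{path}.items")


# Hypothetical fragment mirroring the path from the error message.
schema = {
    "type": "object",
    "properties": {
        "status": {
            "type": "object",
            "properties": {
                "conditions": {
                    "type": "array",
                    "x-kubernetes-list-type": "atomic",
                    "x-kubernetes-list-map-keys": ["type"],
                    "items": {"type": "object"},
                }
            },
        }
    },
}

print(list(find_list_type_conflicts(schema)))
```

Running this kind of check over the schema pulled from the physical cluster should show whether the conflicting combination is already present there or is introduced somewhere during import.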

pdettori commented 1 year ago

I tested with the latest from main (abdae2bf0037d32e4fe6d3f71f161ec1fa8d5892). The error found by @roivaz still shows up in the log:

apiresourceschemas.apis.kcp.io "rev-773.jobs.batch" is forbidden: [spec.versions[0].schema.openAPIV3Schema.properties[status].properties[conditions].x-kubernetes-list-type: Invalid value: "atomic": must be map if x-kubernetes-list-map-keys is non-empty]

@davidfestal any idea what may be causing this?

mjudeikis commented 10 months ago

/transfer-issue contrib-tmc