kubernetes / kubernetes

Production-Grade Container Scheduling and Management
https://kubernetes.io
Apache License 2.0
109.66k stars 39.27k forks source link

WatchList failed when Accept content-type is Table #126206

Open xuzhenglun opened 1 month ago

xuzhenglun commented 1 month ago

What happened?

When perform a List or WatchList with Accept: application/json;as=Table;v=v1;g=meta.k8s.io, api will convert every objects of response into metav1.Table. And as per the specification of the WatchList, a Bookmark event with k8s.io/initial-events-end should be sent after all initial events have been dispatched. Upon receiving this event, the client should close the watch and return the result.

However, when response in Table format is returned, the annotations are placed on the metav1.Table.Rows instead of the top level of the object. This causes misbehavior in client-go.

# curl 'http://localhost:8888/api/v1/pods?allowWatchBookmarks=true&watch=true&sendInitialEvents=true&resourceVersionMatch=NotOlderThan' -H 'Accept: application/json;as=Table;v=v1;g=meta.k8s.io'
...

{
  "type": "BOOKMARK",
  "object": {
    "kind": "Table",
    "apiVersion": "meta.k8s.io/v1",
    "metadata": {
      "resourceVersion": "457339"
    },
    "columnDefinitions": null,
    "rows": [
      {
        "cells": [
          "",
          "0/0",
          "",
          "0",
          "<unknown>",
          "<none>",
          "<none>",
          "<none>",
          "<none>"
        ],
        "object": {
          "kind": "PartialObjectMetadata",
          "apiVersion": "meta.k8s.io/v1",
          "metadata": {
            "resourceVersion": "457339",
            "creationTimestamp": null,
            "annotations": {
              "k8s.io/initial-events-end": "true"
            }
          }
        }
      }
    ]
  }
}

In the current implementation, an error failed to parse watch event will be thrown from staging/src/k8s.io/client-go/rest/request.go:880, and it will fall back to the standard List method.

What did you expect to happen?

WatchList works properly.

How can we reproduce it (as minimally and precisely as possible)?

It can be reproduced by below go code. by runnint the code, a warning message like W0719 01:33:52.152868 45522 simple.go:308] The watchlist request for /v1, Resource=pods ended with an error, falling back to the standard LIST semantics, err = failed to parse watch event: watch.Event{Type:"ADDED", Object:(*v1.Table)(0x14000694870)} can be found in terminal.

package main

import (
    "context"
    "fmt"
    "os"
    "path"
    "strings"

    corev1 "k8s.io/api/core/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    metav1beta1 "k8s.io/apimachinery/pkg/apis/meta/v1beta1"
    "k8s.io/apimachinery/pkg/runtime/schema"
    "k8s.io/client-go/dynamic"
    "k8s.io/client-go/rest"
    "k8s.io/client-go/tools/clientcmd"
    "k8s.io/klog/v2"
    "k8s.io/kubectl/pkg/scheme"
)

func main() {
    home, err := os.UserHomeDir()
    if err != nil {
        klog.Fatal(err)
    }

    cfg, err := clientcmd.BuildConfigFromFlags("", path.Join(home, ".kube/config"))
    if err != nil {
        klog.Fatal(err)
    }

    cfg.GroupVersion = &schema.GroupVersion{}
    cfg.AcceptContentTypes = strings.Join([]string{
        fmt.Sprintf("application/json;as=Table;v=%s;g=%s", metav1.SchemeGroupVersion.Version, metav1.GroupName),
        fmt.Sprintf("application/json;as=Table;v=%s;g=%s", metav1beta1.SchemeGroupVersion.Version, metav1beta1.GroupName),
        "application/json",
    }, ",")

    gv := corev1.SchemeGroupVersion
    cfg.GroupVersion = &gv
    cfg.APIPath = "/api"
    cfg.NegotiatedSerializer = scheme.Codecs.WithoutConversion()
    cfg.UserAgent = rest.DefaultKubernetesUserAgent()

    client, err := rest.RESTClientFor(cfg)
    if err != nil {
        klog.Fatal(err)
    }

    dynamicClient := dynamic.New(client)
    pods, err := dynamicClient.
        Resource(corev1.SchemeGroupVersion.WithResource("pods")).
        Namespace("kube-system").
        List(context.TODO(), metav1.ListOptions{})
    if err != nil {
        klog.Fatal(err)
    }
    klog.Info(pods)
}

Anything else we need to know?

No response

Kubernetes version

```console $ kubectl version Client Version: v1.30.1 Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3 Server Version: v1.31.0-beta.0.25+a70cb76847ada1-dirty ```

Cloud provider

OS version

```console # On Linux: $ cat /etc/os-release # paste output here $ uname -a # paste output here # On Windows: C:\> wmic os get Caption, Version, BuildNumber, OSArchitecture # paste output here ```

Install tools

Container runtime (CRI) and version (if applicable)

Related plugins (CNI, CSI, ...) and versions (if applicable)

xuzhenglun commented 1 month ago

/sig api-machinery

cc @p0lyn0mial

xuzhenglun commented 1 month ago

I think this issus has some similarities with https://github.com/kubernetes/kubernetes/pull/126191. Both of these seem to be caused by differences in metadata between the List and the Singleton object.

xuzhenglun commented 1 month ago

friendly ping @p0lyn0mial

Is this expected behavior on the apiserver side, leaving it up to the client to handle it appropriately? WDYT @deads2k

p0lyn0mial commented 1 month ago

It looks like the standard list request decoded the response successfully with the Accept content-type set as Table. Right? I assume that the decoded object was UnstructuredList.

I think that the WatchList should also be able to decode successfully into UnstructuredList regardless of the requested encoding. If the requested encoding is unsupported, then the WatchList request could/should return an error.

xuzhenglun commented 1 month ago

Thank @p0lyn0mial for reply.

Yes, the standard list returns UnstructuredList sucessfully when response is Table, and no error happened. Alrougth Table is not a list, but Unstructured can do ToList method successfully, and the output is &UnstructuredList{Object: obj.Object} (field Items is nil in this case). However, ListWatch return an error for now, and the behavior of List and WatchList are different.

I'm agree with you that List and WatchList should only support object can be decoded into UnstructuredList, but I'm not sure that if this difference can be considered as breaking compatibility?

Beside of the encoding issue, the thing what I'm really worried about is the initial-end bookmark event is strange when response is a table: metadata.annotation is not available in the root of the object, but it's placed in rows[0].object.metadata.annotation. And the worst thing is that even rows[0].object.metadata.annotation can be unavailiable if includeObject=None param is set, since rows[0].object will be nil.

I'm trying to add WatchList support for kubectl, so I was wondering if I could reuse the WatchList code in the rest client instead of implementing one myself. The includeObject=None parameter doesn't matter to me for what I'm doing, but I'm not sure if other people use it. So I wanted to ask your opinion on this.

p0lyn0mial commented 1 month ago

Since we are trying to replace List with WatchList, I think the new method should also be able to decode data, even if the response is a Table.

I had to roll back the graduation of the feature, so it is turned off by default. Before enabling the feature we should consider fixing this issue.

I don't think that placing the initial-end bookmark annotation on rows[0].object is specific to the WatchList. I believe this is where objects are generally placed when a user requests a Table. For example:

{
  "type": "ADDED",
  "object": {
    "kind": "Table",
    "apiVersion": "meta.k8s.io/v1",
    "metadata": {
      "resourceVersion": "774"
    },
    "columnDefinitions": null,
    "rows": [
      {
        "cells": [
          "todo",
          1,
          "22s"
        ],
        "object": {
          "kind": "PartialObjectMetadata",
          "apiVersion": "meta.k8s.io/v1",
          "metadata": {
            "name": "todo",
            "namespace": "example",
            "uid": "2ea9e613-6487-406e-974a-42c6c7cf9a94",
            "resourceVersion": "774",
            "creationTimestamp": "2024-07-22T12:40:32Z",
            "managedFields": [
              {
                "manager": "kubectl-create",
                "operation": "Update",
                "apiVersion": "v1",
                "time": "2024-07-22T12:40:32Z",
                "fieldsType": "FieldsV1",
                "fieldsV1": {
                  "f:data": {
                    ".": {},
                    "f:foo": {}
                  }
                }
              }
            ]
          }
        }
      }
    ]
  }
}
xuzhenglun commented 1 month ago

@p0lyn0mial

Since we are trying to replace List with WatchList, I think the new method should also be able to decode data, even if the response is a Table.

So, we will modify the implementation of WatchList method in rest package so that we can process Table normally, right?

I don't think that placing the initial-end bookmark annotation on rows[0].object is specific to the WatchList. I believe this is where objects are generally placed when a user requests a Table.

Yes, initial-end bookmark annotation is always placed in rows[0].object when requesting a Table. But there are two small problems when we are using WatchList with this behavior:

  1. Table need a special treatment likes below, and it looks weird. It's just a nit, no big deal.

    
    +func metaAccessor(object runtime.Object) (metav1.Object, error) {
    +   table, ok := object.(*metav1.Table)
    +   if ok {
    +       if len(table.Rows) == 0 || len(table.Rows[0].Object.Raw) == 0 {
    +           return nil, fmt.Errorf("unexcept empty Table")
    +       }
    +
    +       converted, err := runtime.Decode(unstructured.UnstructuredJSONScheme, table.Rows[0].Object.Raw)
    +       if err != nil {
    +           return nil, err
    +       }
    +
    +       return meta.Accessor(converted)
    +   }
    +
    +   return meta.Accessor(object)
    +}
    
    // handleWatchList holds the actual logic for easier unit testing.
    // Note that this function will close the passed watch.
    func (r *Request) handleWatchList(ctx context.Context, w watch.Interface) WatchListResult {
            if event.Type == watch.Error {
                return WatchListResult{err: errors.FromObject(event.Object)}
            }
    -           meta, err := meta.Accessor(event.Object)
    +           meta, err := metaAccessor(event.Object)
            if err != nil {
                return WatchListResult{err: fmt.Errorf("failed to parse watch event: %#v", event)}
            }

2. If the WatchList request with `includeObject=None` params, the `initial-end bookmark annotation` will never been seen by client. I'm not sure if we can accept that it's just not supported. For example, the bookmark event will looks like:
```json
{
  "type": "BOOKMARK",
  "object": {
    "kind": "Table",
    "apiVersion": "meta.k8s.io/v1",
    "metadata": {
      "resourceVersion": "679930"
    },
    "columnDefinitions": null,
    "rows": [
      {
        "cells": [
          "",
          "0/0",
          "",
          "0",
          "<unknown>",
          "<none>",
          "<none>",
          "<none>",
          "<none>"
        ],
        "object": null
      }
    ]
  }
}
xuzhenglun commented 1 month ago

@p0lyn0mial

I just submited a PR https://github.com/kubernetes/kubernetes/pull/126363 which trying to address this issue. Would you please take a look? I'm not sure is it good enough since I'm really new to this part of code.

cici37 commented 1 month ago

/triage accepted