kmesh-net / kmesh

High Performance ServiceMesh Data Plane Based on Programmable Kernel
https://kmesh.net
Apache License 2.0

kmesh does not support bookinfo example route #553

Open lec-bit opened 2 months ago

lec-bit commented 2 months ago

What happened: When I start kmesh in ads mode and try the bookinfo traffic-management examples, I don't get the expected result. The kmesh logs show:

[root@localhost ~]# kubectl logs -f -n kmesh-system kmesh-fwm7k 
cp: cannot create regular file '/lib/modules/6.1.19-7.0.0.17.oe2303.x86_64/kmesh.ko': Read-only file system
depmod: ERROR: openat(/lib/modules/6.1.19-7.0.0.17.oe2303.x86_64, modules.dep.17.851900.1720665190, 301, 644): Read-only file system
depmod: ERROR: openat(/lib/modules/6.1.19-7.0.0.17.oe2303.x86_64, modules.dep.bin.17.851900.1720665190, 301, 644): Read-only file system
depmod: ERROR: openat(/lib/modules/6.1.19-7.0.0.17.oe2303.x86_64, modules.alias.17.851900.1720665190, 301, 644): Read-only file system
depmod: ERROR: openat(/lib/modules/6.1.19-7.0.0.17.oe2303.x86_64, modules.alias.bin.17.851900.1720665190, 301, 644): Read-only file system
depmod: ERROR: openat(/lib/modules/6.1.19-7.0.0.17.oe2303.x86_64, modules.softdep.17.851900.1720665190, 301, 644): Read-only file system
depmod: ERROR: openat(/lib/modules/6.1.19-7.0.0.17.oe2303.x86_64, modules.symbols.17.851900.1720665190, 301, 644): Read-only file system
depmod: ERROR: openat(/lib/modules/6.1.19-7.0.0.17.oe2303.x86_64, modules.symbols.bin.17.851900.1720665190, 301, 644): Read-only file system
depmod: ERROR: openat(/lib/modules/6.1.19-7.0.0.17.oe2303.x86_64, modules.builtin.bin.17.851900.1720665190, 301, 644): Read-only file system
depmod: ERROR: openat(/lib/modules/6.1.19-7.0.0.17.oe2303.x86_64, modules.builtin.alias.bin.17.851900.1720665190, 301, 644): Read-only file system
depmod: ERROR: openat(/lib/modules/6.1.19-7.0.0.17.oe2303.x86_64, modules.devname.17.851900.1720665190, 301, 644): Read-only file system
time="2024-07-11T02:33:10Z" level=info msg="FLAG: --bpf-fs-path=\"/sys/fs/bpf\"" subsys=manager
time="2024-07-11T02:33:10Z" level=info msg="FLAG: --cgroup2-path=\"/mnt/kmesh_cgroup2\"" subsys=manager
time="2024-07-11T02:33:10Z" level=info msg="FLAG: --cni-etc-path=\"/etc/cni/net.d\"" subsys=manager
time="2024-07-11T02:33:10Z" level=info msg="FLAG: --conflist-name=\"\"" subsys=manager
time="2024-07-11T02:33:10Z" level=info msg="FLAG: --enable-bpf-log=\"true\"" subsys=manager
time="2024-07-11T02:33:10Z" level=info msg="FLAG: --enable-bypass=\"false\"" subsys=manager
time="2024-07-11T02:33:10Z" level=info msg="FLAG: --enable-mda=\"false\"" subsys=manager
time="2024-07-11T02:33:10Z" level=info msg="FLAG: --enable-secret-manager=\"false\"" subsys=manager
time="2024-07-11T02:33:10Z" level=info msg="FLAG: --help=\"false\"" subsys=manager
time="2024-07-11T02:33:10Z" level=info msg="FLAG: --mode=\"ads\"" subsys=manager
time="2024-07-11T02:33:10Z" level=info msg="FLAG: --plugin-cni-chained=\"true\"" subsys=manager
time="2024-07-11T02:33:13Z" level=info msg="bpf Start successful" subsys=manager
time="2024-07-11T02:33:13Z" level=info msg="start kmesh manage controller successfully" subsys=controller
time="2024-07-11T02:33:13Z" level=info msg="service node sidecar~10.244.1.4~kmesh-fwm7k.kmesh-system~kmesh-system.svc.cluster.local connect to discovery address istiod.istio-system.svc:15012" subsys=controller/config
time="2024-07-11T02:33:13Z" level=info msg="controller Start successful" subsys=manager
time="2024-07-11T02:33:13Z" level=info msg="start write CNI config\n" subsys="cni installer"
time="2024-07-11T02:33:13Z" level=info msg="kmesh cni use chained\n" subsys="cni installer"
time="2024-07-11T02:33:13Z" level=info msg="Copied /usr/bin/kmesh-cni to /opt/cni/bin." subsys="cni installer"
time="2024-07-11T02:33:13Z" level=info msg="kubeconfig either does not exist or is out of date, writing a new one" subsys="cni installer"
time="2024-07-11T02:33:13Z" level=info msg="wrote kubeconfig file /etc/cni/net.d/kmesh-cni-kubeconfig" subsys="cni installer"
time="2024-07-11T02:33:13Z" level=info msg="cni config file: /etc/cni/net.d/10-kindnet.conflist" subsys="cni installer"
time="2024-07-11T02:33:13Z" level=info msg="command Start cni successful" subsys=manager

time="2024-07-11T02:58:59Z" level=info msg="[KMESH] DEBUG: bpf find listener addr=[10.96.42.28:9080]\n" subsys=ebpf
time="2024-07-11T02:58:59Z" level=info msg="[KMESH] ERR: bpf set sockopt failed! ret:0\n" subsys=ebpf
time="2024-07-11T02:58:59Z" level=info msg="[CLUSTER] ERR: failed to get cla endpoints ptrs\n" subsys=ebpf
time="2024-07-11T02:58:59Z" level=info msg="[CLUSTER] ERR: failed to reflush cluster(PassthroughCluster) endpoints\n" subsys=ebpf
time="2024-07-11T02:58:59Z" level=info msg="[KMESH] DEBUG: bpf find listener addr=[10.96.7.138:9080]\n" subsys=ebpf
time="2024-07-11T02:58:59Z" level=info msg="[KMESH] ERR: bpf set sockopt failed! ret:0\n" subsys=ebpf
time="2024-07-11T02:58:59Z" level=info msg="[CLUSTER] ERR: failed to get cla endpoints ptrs\n" subsys=ebpf
time="2024-07-11T02:58:59Z" level=info msg="[CLUSTER] ERR: failed to reflush cluster(PassthroughCluster) endpoints\n" subsys=ebpf
time="2024-07-11T02:58:59Z" level=info msg="[KMESH] DEBUG: bpf find listener addr=[10.96.57.127:9080]\n" subsys=ebpf
time="2024-07-11T02:58:59Z" level=info msg="[KMESH] ERR: bpf set sockopt failed! ret:0\n" subsys=ebpf
time="2024-07-11T02:58:59Z" level=info msg="[CLUSTER] ERR: failed to get cla endpoints ptrs\n" subsys=ebpf
time="2024-07-11T02:58:59Z" level=info msg="[CLUSTER] ERR: failed to reflush cluster(PassthroughCluster) endpoints\n" subsys=ebpf
time="2024-07-11T02:58:59Z" level=info msg="[KMESH] DEBUG: bpf find listener addr=[10.96.57.127:9080]\n" subsys=ebpf
time="2024-07-11T02:58:59Z" level=info msg="[KMESH] ERR: bpf set sockopt failed! ret:0\n" subsys=ebpf
time="2024-07-11T02:58:59Z" level=info msg="[CLUSTER] ERR: failed to get cla endpoints ptrs\n" subsys=ebpf
time="2024-07-11T02:58:59Z" level=info msg="[CLUSTER] ERR: failed to reflush cluster(PassthroughCluster) endpoints\n" subsys=ebpf

I found that domain matching in the route config is incorrect: kmesh is unable to match the correct host. For example, the Host in the HTTP request is productpage:9080, and the domains in the ads config are:

          {
            "name": "productpage.default.svc.cluster.local:9080",
            "domains": [
              "productpage.default.svc.cluster.local",
              "productpage.default",
              "productpage.default.svc",
              "10.96.81.230"
            ],
            "routes": [
              {
                "name": "default",
                "match": {
                  "prefix": "/"
                },
                "route": {
                  "cluster": "outbound|9080||productpage.default.svc.cluster.local",
                  "retryPolicy": {
                    "numRetries": 2
                  }
                }
              }
            ]
          },
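The mismatch can be reproduced outside the kernel with a small sketch. It assumes exact string comparison against the `domains` list after dropping the `:port` suffix from the Host header. The helper names `strip_port` and `host_matches_domains` are hypothetical and are not taken from the kmesh source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the Host header value without its ":port" suffix.
 * Writes at most out_len - 1 bytes plus a terminating NUL.
 * Note: this simple split does not handle bracketed IPv6 literals. */
static void strip_port(const char *host, char *out, size_t out_len)
{
    assert(out_len > 0);
    size_t n = strcspn(host, ":"); /* length up to the first ':' (or whole string) */
    if (n >= out_len)
        n = out_len - 1;
    memcpy(out, host, n);
    out[n] = '\0';
}

/* Hypothetical helper: exact match of the port-less host against the
 * domains of one virtual host. This is an assumption about the matching
 * rule, not a copy of the kmesh implementation. */
static bool host_matches_domains(const char *host_header,
                                 const char *const *domains, size_t n_domains)
{
    char host[256];
    strip_port(host_header, host, sizeof(host));
    for (size_t i = 0; i < n_domains; i++) {
        if (strcmp(host, domains[i]) == 0)
            return true;
    }
    return false;
}
```

With the four domains listed above, `host_matches_domains("productpage:9080", ...)` returns false because the bare name `productpage` is not in the list, while `productpage.default:9080` would match.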

Kmesh cannot automatically obtain and complete the namespace (ns) information.

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

hzxuzhonghu commented 2 months ago

/label area/ads

kmesh-bot commented 2 months ago

@hzxuzhonghu: The label(s) /label area/ads cannot be applied. These labels are supported: tide/merge-method-merge, tide/merge-method-rebase, tide/merge-method-squash. Is this label configured under labels -> additional_labels or labels -> restricted_labels in plugin.yaml?

In response to [this](https://github.com/kmesh-net/kmesh/issues/553#issuecomment-2221937607):

> /label area/ads

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

hzxuzhonghu commented 2 months ago

/label area/ads

Okabe-Rintarou-0 commented 2 months ago

kmesh in ads mode:

Okabe-Rintarou-0 commented 2 months ago

https://github.com/kmesh-net/kmesh/blob/064614bf3241e55312cf9d02c681fbf216bdddbd/bpf/kmesh/ads/include/route_config.h#L30-L75

For example, here ptr is httpbin and domain is httpbin.default.svc.cluster.local. We call bpf_strnstr:

bpf_strnstr(ptr, domain, ptr_length)

This searches for httpbin.default.svc.cluster.local inside httpbin, which must fail because the string being searched for is longer than the string being searched.
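A userspace sketch of why this lookup cannot succeed. It assumes `bpf_strnstr` behaves like a length-bounded `strstr(haystack, needle, haystack_len)`. `bounded_strstr` is a hypothetical stand-in written for illustration, not the kmesh helper itself.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a bounded substring search: look for `needle`
 * within the first `hay_len` bytes of `hay`. The caller must ensure
 * hay_len does not exceed the length of `hay`.
 * Returns a pointer to the first match, or NULL if there is none. */
static const char *bounded_strstr(const char *hay, const char *needle,
                                  size_t hay_len)
{
    size_t needle_len = strlen(needle);

    if (needle_len == 0)
        return hay;
    if (needle_len > hay_len)
        return NULL; /* a needle longer than the haystack can never match */

    for (size_t i = 0; i + needle_len <= hay_len; i++) {
        if (memcmp(hay + i, needle, needle_len) == 0)
            return hay + i;
    }
    return NULL;
}
```

Calling `bounded_strstr("httpbin", "httpbin.default.svc.cluster.local", 7)` returns NULL. Swapping the arguments would produce a match, but a plain substring test would equally accept an `httpbin` service in any other namespace, so it does not look like a safe fix on its own.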

hzxuzhonghu commented 2 months ago

I think this is because kmesh works as a node-level proxy. There can be multiple httpbin services residing in different namespaces, so when a request passes through kmesh, it cannot know where to redirect it based only on the httpbin host name.
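To make the ambiguity concrete, here is a sketch of how a short host name could be completed if the caller's namespace were known. It assumes the cluster domain is `cluster.local` and that the source namespace is available to the proxy, which is exactly the information a node-level proxy lacks by default. `qualify_host` is a hypothetical helper, not part of kmesh.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: expand a short service name into a fully qualified
 * one using the namespace of the calling workload. Names that already
 * contain a dot are copied unchanged.
 * Assumes the cluster domain is "cluster.local".
 * Returns 0 on success, -1 if the result does not fit in `out`. */
static int qualify_host(const char *host, const char *src_ns,
                        char *out, size_t out_len)
{
    int n;

    if (strchr(host, '.') != NULL)
        n = snprintf(out, out_len, "%s", host);
    else
        n = snprintf(out, out_len, "%s.%s.svc.cluster.local", host, src_ns);

    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}
```

The same short name `httpbin` expands to `httpbin.default.svc.cluster.local` for a caller in `default` and to `httpbin.foo.svc.cluster.local` for a caller in `foo`, so the host name alone does not identify the destination.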