alibaba / higress

Cloud Native API Gateway | 云原生API网关
https://higress.io
Apache License 2.0
2.5k stars 407 forks

Envoy crashes after a Wasm plugin is used in a kind deployment on macOS 13.6 #1025

Open FQHSLycopene opened 3 weeks ago

FQHSLycopene commented 3 weeks ago

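System: macOS 13.6 (Intel), Docker 24.0.2, Higress 1.4.0

Steps to reproduce: following the Quick Start, build a local k8s cluster with kind and install Higress. After any Wasm plugin is installed, Envoy crashes. Plugin: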

kind: WasmPlugin
metadata:
  name: request-block
  namespace: higress-system
spec:
  defaultConfig:
    block_urls:
    - swagger.html
    - foo=bar
    case_sensitive: false
  url: oci://higress-registry.cn-hangzhou.cr.aliyuncs.com/plugins/request-block:1.0.0
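
For readers unfamiliar with this plugin, here is a minimal sketch of what the `defaultConfig` above is meant to express. It assumes that `block_urls` entries are matched as substrings of the request URL and that `case_sensitive: false` makes the comparison case-insensitive. The function name `is_blocked` is hypothetical and this is not the plugin's actual code.

```python
from typing import List


def is_blocked(url: str, block_urls: List[str], case_sensitive: bool = False) -> bool:
    """Return True if any configured fragment occurs in the request URL.

    Assumption: substring matching, lowercased on both sides when
    case_sensitive is False. Illustrative only.
    """
    if not case_sensitive:
        url = url.lower()
        block_urls = [fragment.lower() for fragment in block_urls]
    return any(fragment in url for fragment in block_urls)


# Values copied from the WasmPlugin defaultConfig above.
config = {"block_urls": ["swagger.html", "foo=bar"], "case_sensitive": False}

for candidate in ("/api/Swagger.html", "/get?foo=bar", "/get"):
    print(candidate, is_blocked(candidate, **config))
```

Under these assumptions the first two URLs would be blocked and the third would pass through.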

Container logs:

[Envoy (Epoch 0)] [2024-06-03 06:10:07.678][46][critical][backtrace] Caught Segmentation fault, suspect faulting address 0x7fbc216dc000
[Envoy (Epoch 0)] [2024-06-03 06:10:07.678][46][critical][backtrace] Backtrace (use tools/stack_decode.py to get line numbers):
[Envoy (Epoch 0)] [2024-06-03 06:10:07.678][46][critical][backtrace] Envoy version: 4ad0eba4dd5f63b10260495f263ab3971326b4f5/1.20.0/Clean/RELEASE/BoringSSL
2024-06-03T06:10:07.684179Z error   Epoch 0 exited with error: signal: segmentation fault
2024-06-03T06:10:07.684243Z info    No more active epochs, terminating
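
When comparing crash reports across environments, it can help to extract the key facts from logs like the one above. The following is a rough sketch that assumes the line layout shown here (`[timestamp][thread][level][component] message`). The helper name `summarize_crash` and the returned dictionary keys are made up for illustration.

```python
import re

# Matches the "[timestamp][thread][level][component] message" part of an Envoy log line.
LINE_RE = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} [\d:.]+)\]"
    r"\[(?P<tid>\d+)\]\[(?P<level>\w+)\]\[(?P<component>\w+)\] (?P<msg>.*)"
)
# Matches "Envoy version: <commit>/<version>/..." inside the message.
VERSION_RE = re.compile(r"Envoy version: (?P<commit>[0-9a-f]+)/(?P<version>[\d.]+)/")


def summarize_crash(log_text: str) -> dict:
    """Collect segfault details from critical-level Envoy log lines."""
    summary = {"segfault": False, "fault_address": None, "envoy_version": None}
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if match is None or match["level"] != "critical":
            continue
        msg = match["msg"]
        if "Segmentation fault" in msg:
            summary["segfault"] = True
            addr = re.search(r"0x[0-9a-f]+", msg)
            if addr:
                summary["fault_address"] = addr.group(0)
        version = VERSION_RE.search(msg)
        if version:
            summary["envoy_version"] = version["version"]
    return summary


sample = (
    "[Envoy (Epoch 0)] [2024-06-03 06:10:07.678][46][critical][backtrace] "
    "Caught Segmentation fault, suspect faulting address 0x7fbc216dc000\n"
    "[Envoy (Epoch 0)] [2024-06-03 06:10:07.678][46][critical][backtrace] "
    "Envoy version: 4ad0eba4dd5f63b10260495f263ab3971326b4f5/1.20.0/Clean/RELEASE/BoringSSL\n"
)
print(summarize_crash(sample))
```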

johnlanni commented 3 weeks ago

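I'm on macOS 11.4 (Intel) and could not reproduce this problem. Could you try installing Higress 1.3.6? I want to see whether this is a compatibility issue caused by the V8 upgrade.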

FQHSLycopene commented 3 weeks ago

With 1.3.6, the problem no longer occurs.

johnlanni commented 3 weeks ago

OK, I'll try rolling back the V8 version.

johnlanni commented 3 weeks ago

Could you please start it with docker-compose, generate a core file, and send it to me so I can take a look?

version: '3.7'
services:
  envoy:
    image: higress-registry.cn-hangzhou.cr.aliyuncs.com/higress/gateway:v1.4.0
    entrypoint: /usr/local/bin/envoy
    command: -c /etc/envoy/envoy.yaml --component-log-level wasm:debug
    depends_on:
      - httpbin
    networks:
      - wasmtest
    ports:
      - "10000:10000"
    volumes:
      - ./envoy.yaml:/etc/envoy/envoy.yaml
      - ./main.wasm:/etc/envoy/main.wasm
    # Set ulimit -c to unlimited so that core dumps can be generated
    ulimits:
      core: -1
  httpbin:
    image: kennethreitz/httpbin:latest
    networks:
      - wasmtest
    ports:
      - "12345:80"
networks:
  wasmtest: {}
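
The `ulimits: core: -1` setting above only helps if the limit actually reaches the process. As a hedged sketch, the snippet below reads the core-file size limit of the current process using Python's standard `resource` module (Unix only). It assumes a Python interpreter is available in the environment being checked, which may not be true inside the gateway image. `describe_core_limit` is a hypothetical helper name.

```python
import resource


def describe_core_limit(soft: int, hard: int):
    """Return (enabled, description) for a pair of RLIMIT_CORE values."""

    def fmt(value: int) -> str:
        # RLIM_INFINITY is how the OS reports "unlimited".
        return "unlimited" if value == resource.RLIM_INFINITY else str(value)

    # A soft limit of 0 means the kernel will not write a core file.
    enabled = soft != 0
    return enabled, f"soft={fmt(soft)} hard={fmt(hard)}"


soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
print(describe_core_limit(soft, hard))
```

If the first element is `False`, no core file would be written even when Envoy segfaults.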

envoy.yaml:

admin:
  address:
    socket_address:
      protocol: TCP
      address: 0.0.0.0
      port_value: 9901
static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address:
        protocol: TCP
        address: 0.0.0.0
        port_value: 10000
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          scheme_header_transformation:
            scheme_to_overwrite: https
          stat_prefix: ingress_http
          route_config:
            name: local_route
            virtual_hosts:
            - name: local_service
              domains: ["*"]
              routes:
              - match:
                  prefix: "/"
                route:
                  cluster: httpbin
          http_filters:
          - name: wasmdemo
            typed_config:
              "@type": type.googleapis.com/udpa.type.v1.TypedStruct
              type_url: type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
              value:
                config:
                  name: wasmdemo
                  vm_config:
                    runtime: envoy.wasm.runtime.v8
                    code:
                      local:
                        filename: /etc/envoy/main.wasm
                  configuration:
                    "@type": "type.googleapis.com/google.protobuf.StringValue"
                    value: |
                      {
                        "mockEnable": false
                      }
          - name: envoy.filters.http.router
  clusters:
  - name: httpbin
    connect_timeout: 30s
    type: LOGICAL_DNS
    # Comment out the following line to test on v6 networks
    dns_lookup_family: V4_ONLY
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: httpbin
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: httpbin
                port_value: 80

main.wasm: main.wasm.zip

johnlanni commented 3 weeks ago

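[image]

It does indeed crash inside V8. So far it looks like macOS versions newer than 11.4 are affected. The Linux environments commonly used in production are not affected. I'll follow up on this later.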