gocrane / crane

Crane is a FinOps Platform for Cloud Resource Analytics and Economics in Kubernetes clusters. The goal is not only to help users to manage cloud cost easier but also ensure the quality of applications.
https://gocrane.io
Apache License 2.0

crane-agent cannot connect to the existing runtime endpoint when using the default runtime endpoint #857

Closed xrmzju closed 10 months ago

xrmzju commented 11 months ago

Describe the bug

In the current logic, if the --runtime-endpoint argument is given, crane-agent puts that runtimeEndpoint at the front of the list and tries it first, followed by the default endpoints:

    var runtimeEndpoints []string
    if runtimeEndpoint != "" {
        runtimeEndpoints = append(runtimeEndpoints, runtimeEndpoint)
    }
    runtimeEndpoints = append(runtimeEndpoints, defaultRuntimeEndpoints...)

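However, the NewRemoteRuntimeService func dials the socket in non-blocking mode, so it returns success immediately even if the runtime endpoint does not exist. As a result, the loop below always returns the first endpoint in the list, whether or not it is reachable: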

    for _, endpoint := range runtimeEndpoints {
        containerRuntime, err := criremote.NewRemoteRuntimeService(endpoint, 3*time.Second)
        if err == nil {
            return containerRuntime, nil
        }
        errs = append(errs, err)
    }

Reproduce steps

  1. Start crane-agent with a non-existent runtime endpoint:
    --runtime-endpoint=unix:///var/run/none.sock
  2. crane-agent panics with a bad runtimeService conn

Expected behavior

  1. If a runtime endpoint is given, connect to that endpoint only
  2. If no runtime endpoint is given, try the default runtime-endpoint list one by one until one really succeeds (by dialing in blocking mode)

Screenshots

Environment (please complete the following information):