cloud-barista / cb-spider

CB-Spider offers a unified view and interface for multi-cloud management.
https://github.com/cloud-barista/cb-spider/wiki
Apache License 2.0

[PMKS:Azure] Creating a cluster panics (invalid memory address or nil pointer dereference) #1170

Closed sykim-etri closed 2 months ago

sykim-etri commented 6 months ago

When attempting to create a cluster on Azure (KoreaCentral region), a nil pointer dereference occurs and cluster creation fails.

https://github.com/cloud-barista/cb-spider/blob/c9c6da3ece00615367db3b50572fd216f0e53b2d/cloud-control-manager/cloud-driver/drivers/azure/resources/ClusterHandler.go#L1771-L1778

Because recover() code was inserted for debugging, the line numbers in the error may differ slightly from the linked source.
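For readers unfamiliar with the pattern, a deferred recover() of the kind mentioned above can be sketched as follows. This is a minimal illustration and not the code in ClusterHandler.go; `safeCall` is a hypothetical helper name, and the nil `*string` only stands in for a field like DestinationPortRange.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// safeCall runs fn and converts a panic into an error that carries the
// stack trace. Hypothetical helper, not part of cb-spider.
func safeCall(fn func()) (err error) {
	defer func() {
		if r := recover(); r != nil {
			// debug.Stack() captures the goroutine stack at the point of recovery.
			err = fmt.Errorf("PANIC: %v\n%s", r, debug.Stack())
		}
	}()
	fn()
	return nil
}

func main() {
	var port *string // nil, standing in for an unset pointer field

	err := safeCall(func() {
		if *port == "80" { // dereferencing a nil pointer panics here
			fmt.Println("match")
		}
	})
	fmt.Println("recovered:", err != nil)
}
```

The deferred function adds its own frames to the trace, which is one reason reported line numbers can shift after such code is inserted.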

(network.SecurityRule) {                                                                            
 Response: (autorest.Response) {                                                                   
  Response: (*http.Response)(<nil>)                                                                
 },                                              
 SecurityRulePropertiesFormat: (*network.SecurityRulePropertiesFormat)(0xc001522a00)({                                                                                                                 
  Description: (*string)(<nil>),                                                                   
  Protocol: (network.SecurityRuleProtocol) (len=3) "Tcp",                                                                                                                                              
  SourcePortRange: (*string)(0xc00146bf80)((len=1) "*"),                                                                                                                                               
  DestinationPortRange: (*string)(<nil>),                                                          
  SourceAddressPrefix: (*string)(0xc00146bf90)((len=8) "Internet"),                                                                                                                                    
  SourceAddressPrefixes: (*[]string)(0xc00157c348)({                                                                                                                                                   
  }),                                            
  SourceApplicationSecurityGroups: (*[]network.ApplicationSecurityGroup)(<nil>),                                                                                                                       
  DestinationAddressPrefix: (*string)(<nil>),                                                      
  DestinationAddressPrefixes: (*[]string)(0xc00157c378)((len=1 cap=1) {                                                                                                                                
   (string) (len=12) "4.230.147.59"                                                                
  }),                                            
  DestinationApplicationSecurityGroups: (*[]network.ApplicationSecurityGroup)(<nil>),                                                                                                                  
  SourcePortRanges: (*[]string)(0xc00157c300)({                                                    
  }),                                            
  DestinationPortRanges: (*[]string)(0xc00157c330)((len=2 cap=2) {                                                                                                                                     
   (string) (len=3) "443",                                                                         
   (string) (len=2) "80"                         
  }),                                            
  Access: (network.SecurityRuleAccess) (len=5) "Allow",                                                                                                                                                
  Priority: (*int32)(0xc00228f780)(500),                                                           
  Direction: (network.SecurityRuleDirection) (len=7) "Inbound",                                                                                                                                        
  ProvisioningState: (network.ProvisioningState) (len=9) "Succeeded"                                                                                                                                   
 }),                                             
 Name: (*string)(0xc00146bfc0)((len=56) "k8s-azure-lb_allow_IPv4_556f7044ec033071ec0dfcf7cd85bc93"),                                                                                                   
 Etag: (*string)(0xc00146bf50)((len=40) "W/\"f3cc4c80-6a4a-4df1-8a4b-8b34c1e9d010\""),                                                                                                                 
 Type: (*string)(0xc00146bf60)((len=53) "Microsoft.Network/networkSecurityGroups/securityRules"),                                                                                                      
 ID: (*string)(0xc00146bf40)((len=275) "/subscriptions/a20fed83-96bd-4480-92a9-140b8e3b7c3a/resourceGroups/cb_koreacentral_ns01-tb111-cogbd6kcpuq5abkfc380_koreacentral/providers/Microsoft.Network/networkSecurityGroups/aks-agentpool-26245572-nsg/securityRules/k8s-azure-lb_allow_IPv4_556f7044ec033071ec0dfcf7cd85bc93")
}                                                

[CB-SPIDER].[ERROR]: 2024-04-18 05:39:35 ClusterHandler.go:55, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/azure/resources.(*AzureClusterHandler).CreateCluster.func1() - PANIC!!
runtime error: invalid memory address or nil pointer dereference                                                                                                                                       
goroutine 370 [running]:                         
runtime/debug.Stack()                            
        /usr/local/go/src/runtime/debug/stack.go:24 +0x5e                                                                                                                                              
github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/azure/resources.(*AzureClusterHandler).CreateCluster.func1()
        /home/sykim/go/src/github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/azure/resources/ClusterHandler.go:54 +0x38
panic({0x53b75a0?, 0xa8be810?})                                                                    
        /usr/local/go/src/runtime/panic.go:914 +0x21f                                                                                                                                                  
github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/azure/resources.waitingClusterBaseSecurityGroup({{_, _}, {_, _}}, _, _, {_, _}, {{0xc000ab4630, 0x24}, ...}, ...)
        /home/sykim/go/src/github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/azure/resources/ClusterHandler.go:1783 +0x883
github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/azure/resources.(*AzureClusterHandler).CreateCluster(_, {{{0xc0009e20c0, 0x1f}, {0x0, 0x0}}, {0xc000a80f0a, 0x6}, {{{0xc000aec380, 0x2f}, {0xc0002a9130, ...}}, ...}, ...})
        /home/sykim/go/src/github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/azure/resources/ClusterHandler.go:73 +0x58f
github.com/cloud-barista/cb-spider/api-runtime/common-runtime.CreateCluster({_, _}, {_, _}, {{{0xc0009e20c0, 0x1f}, {0x0, 0x0}}, {0xc000a80f0a, 0x6}, ...})
        /home/sykim/go/src/github.com/cloud-barista/cb-spider/api-runtime/common-runtime/ClusterManager.go:479 +0x19bd
github.com/cloud-barista/cb-spider/api-runtime/rest-runtime.CreateCluster({0x6b91200, 0xc0006400a0})                                                                                                   
        /home/sykim/go/src/github.com/cloud-barista/cb-spider/api-runtime/rest-runtime/ClusterRest.go:171 +0x7ca
github.com/labstack/echo/v4.(*Echo).add.func1({0x6b91200, 0xc0006400a0})                                                                                                                               
        /home/sykim/go/pkg/mod/github.com/labstack/echo/v4@v4.9.0/echo.go:536 +0x4b                                                                                                                    
github.com/cloud-barista/cb-spider/api-runtime/rest-runtime.ApiServer.Recover.RecoverWithConfig.func2.1({0x6b91200?, 0xc0006400a0})
        /home/sykim/go/pkg/mod/github.com/labstack/echo/v4@v4.9.0/middleware/recover.go:119 +0xf3                                                                                                      
github.com/labstack/echo/v4/middleware.LoggerWithConfig.func2.1({0x6b91200?, 0xc0006400a0})                                                                                                            
        /home/sykim/go/pkg/mod/github.com/labstack/echo/v4@v4.9.0/middleware/logger.go:119 +0xd2                                                                                                       
github.com/labstack/echo/v4/middleware.CORSWithConfig.func1.1({0x6b91200, 0xc0006400a0})                                                                                                               
        /home/sykim/go/pkg/mod/github.com/labstack/echo/v4@v4.9.0/middleware/cors.go:142 +0x463                                                                                                        
github.com/labstack/echo/v4.(*Echo).ServeHTTP(0xc001498000, {0x6ae5408?, 0xc00158e540}, 0xc000ae8200)
        /home/sykim/go/pkg/mod/github.com/labstack/echo/v4@v4.9.0/echo.go:646 +0x399                                                                                                                   
net/http.serverHandler.ServeHTTP({0x6ad5f30?}, {0x6ae5408?, 0xc00158e540?}, 0x6?)                                                                                                                      
        /usr/local/go/src/net/http/server.go:2938 +0x8e                                                                                                                                                
net/http.(*conn).serve(0xc00048d560, {0x6af3cf8, 0xc0014ce090})                                                                                                                                        
        /usr/local/go/src/net/http/server.go:2009 +0x5f4                                                                                                                                               
created by net/http.(*Server).Serve in goroutine 291                                                                                                                                                   
        /usr/local/go/src/net/http/server.go:3086 +0x5cb

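According to the log above, the way securityGroupsClient.Get() returns its result appears to have changed: the ports now arrive in DestinationPortRanges rather than DestinationPortRange, so DestinationPortRange is nil when the existing code dereferences it.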
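To make the observation concrete, here is a minimal sketch of reading the ports without assuming which of the two fields is populated. The `securityRule` struct is a stand-in that copies only two field names from the dump above; it is not the real SDK type, and `destinationPorts` is a hypothetical helper.

```go
package main

import "fmt"

// securityRule is a stand-in for the two relevant fields seen in the dump
// (network.SecurityRulePropertiesFormat); it is NOT the real SDK type.
type securityRule struct {
	DestinationPortRange  *string
	DestinationPortRanges *[]string
}

// destinationPorts collects ports from both the singular and the plural
// field, skipping whichever one is nil. Hypothetical helper.
func destinationPorts(r securityRule) []string {
	var ports []string
	if r.DestinationPortRange != nil {
		ports = append(ports, *r.DestinationPortRange)
	}
	if r.DestinationPortRanges != nil {
		ports = append(ports, *r.DestinationPortRanges...)
	}
	return ports
}

func main() {
	// Same shape as the dumped rule: singular field nil, plural field set.
	ranges := []string{"443", "80"}
	rule := securityRule{DestinationPortRanges: &ranges}

	fmt.Println(destinationPorts(rule)) // prints [443 80]
}
```

With a rule shaped like the one in the dump, `*rule.DestinationPortRange` would panic, while the helper simply returns the two ports from the plural field.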

Environment

sykim-etri commented 6 months ago

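A simple change like the one below is enough to avoid the error, but understanding the exact intent of this check and improving it properly will likely take more time. A review from the original developer would be helpful.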

@@ -1769,15 +1778,23 @@ func waitingClusterBaseSecurityGroup(createdClusterIID irs.IID, managedClustersC
                sg, err := securityGroupsClient.Get(ctx, clusterManagedResourceGroup, *baseSecurityGroup.Name, "")
                if err == nil {
                        for _, rule := range *sg.SecurityRules {
-                               if *rule.Priority == 500 && *rule.DestinationPortRange == "80" {
-                                       baseRuleCheck++
-                               }
-                               if *rule.Priority == 501 && *rule.DestinationPortRange == "443" {
-                                       baseRuleCheck++
+                               if *rule.Priority == 500 || *rule.Priority == 501 {
+                                       if rule.DestinationPortRange != nil &&
+                                               (*rule.DestinationPortRange == "80" || *rule.DestinationPortRange == "443") {
+                                               baseRuleCheck++
+                                       } else {
+                                               for _, portRange := range *rule.DestinationPortRanges {
+                                                       if portRange == "80" {
+                                                               baseRuleCheck++
+                                                       } else if portRange == "443" {
+                                                               baseRuleCheck++
+                                                       }
+                                               }
+                                       }
                                }
                        }
                }
-               if baseRuleCheck == 2 {
+               if baseRuleCheck >= 2 {
                        break
                }
                apiCallCount++
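One caveat with the patch above: in the `else` branch, `*rule.DestinationPortRanges` is still dereferenced without a nil check, and `*sg.SecurityRules` and `*rule.Priority` are likewise assumed to be non-nil. A more defensive variant could look like the sketch below. The `rule` struct and `baseRulesReady` are hypothetical stand-ins rather than cb-spider or SDK code, and treating "ports 80 and 443 both seen at priority 500 or 501" as the readiness condition is an assumption about the intent of the original check.

```go
package main

import "fmt"

// rule mirrors only the fields used by the check; hypothetical stand-in
// for the Azure SDK security rule type.
type rule struct {
	Priority              *int32
	DestinationPortRange  *string
	DestinationPortRanges *[]string
}

// baseRulesReady reports whether the rules at priority 500 or 501 together
// open both port 80 and port 443. Every pointer is checked before use.
func baseRulesReady(rules *[]rule) bool {
	if rules == nil {
		return false
	}
	seen := map[string]bool{}
	for _, r := range *rules {
		if r.Priority == nil || (*r.Priority != 500 && *r.Priority != 501) {
			continue
		}
		if r.DestinationPortRange != nil {
			seen[*r.DestinationPortRange] = true
		}
		if r.DestinationPortRanges != nil {
			for _, p := range *r.DestinationPortRanges {
				seen[p] = true
			}
		}
	}
	return seen["80"] && seen["443"]
}

func main() {
	p500, p501 := int32(500), int32(501)
	http, https := "80", "443"
	both := []string{"443", "80"}

	// New response shape: one rule carrying both ports in the plural field.
	combined := []rule{{Priority: &p500, DestinationPortRanges: &both}}
	// Old response shape: two rules, each with the singular field.
	separate := []rule{
		{Priority: &p500, DestinationPortRange: &http},
		{Priority: &p501, DestinationPortRange: &https},
	}

	fmt.Println(baseRulesReady(&combined), baseRulesReady(&separate), baseRulesReady(nil))
}
```

Tracking the distinct ports seen, instead of incrementing a counter, avoids reaching the threshold when the same port appears in two rules.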

powerkimhub commented 6 months ago

@ish-hcc (cc: @sykim-etri )


[7:18:19 PM] curl -sX POST http://localhost:1024/spider/cluster -H 'Content-Type: application/json' -d '{ "ConnectionName" : "azure-northeu-config", "ReqInfo" : {"Name" : "spider-cluster-01","Version" : "1.28.5", "VPCName" : "vpc-01", "SubnetNames" : ["subnet-01"], "SecurityGroupNames" : ["sg-01"], "NodeGroupList": [ {
        "Name" :            "economy", 
        "ImageName" :       "", 
        "VMSpecName" :      "Standard_D8ds_v5",   <===============
        "RootDiskType" :    "", 
        "RootDiskSize" :    "60", 
        "KeyPairName" :     "keypair-01",
        "OnAutoScaling" :   "true", 
        "DesiredNodeSize" : "2", 
        "MinNodeSize" :     "1", 
        "MaxNodeSize" :     "3"
}
                         ]                                  }                               }'
[7:26:13 PM]    ==> {"message":"Failed to Create Cluster. err = containerservice.ManagedClustersClient#CreateOrUpdate: Failure sending request: StatusCode=500 -- Original Error: Code=\"InternalOperationError\" Message=\"Internal server error\""}

ish-hcc commented 4 months ago

PR created: https://github.com/cloud-barista/cb-spider/pull/1213

powerkimhub commented 4 months ago

@sykim-etri

ish-hcc commented 2 months ago

1259