cloud-barista / cb-spider

CB-Spider offers a unified view and interface for multi-cloud management.
https://github.com/cloud-barista/cb-spider/wiki
Apache License 2.0

[GCP:Cluster] I got an error with timeout when adding a nodegroup, BUT a nodegroup is provisioned. #1334

Closed sykim-etri closed 1 month ago

sykim-etri commented 2 months ago

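When adding a node group, the wait timeout is too short, so the node group's information cannot be retrieved and an error is returned. A little later, however, the node group turns out to have been created successfully.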

This happens almost every time in the me-central1, me-central2, and me-west1 regions. I confirmed that changing the timeout to 60 seconds lowers the failure rate.

The way the node group creation result is checked appears to need improvement.

The CB-SP log at the time of the error is as follows.

[CB-SPIDER].[INFO]: 2024-09-20 15:12:04 ClusterRest.go:356, github.com/cloud-barista/cb-spider/api-runtime/rest-runtime.AddNodeGroup() - call AddNodeGroup()
[CB-SPIDER].[INFO]: 2024-09-20 15:12:04 ClusterManager.go:831, github.com/cloud-barista/cb-spider/api-runtime/common-runtime.AddNodeGroup() - call AddNodeGroup()
[CB-SPIDER].[INFO]: 2024-09-20 15:12:04 ConnectionConfigInfoManager.go:102, github.com/cloud-barista/cb-spider/cloud-info-manager/connection-config-info-manager.GetConnectionConfig() - call GetConnectionConfig()
[CB-SPIDER].[INFO]: 2024-09-20 15:12:04 RegionInfoManager.go:114, github.com/cloud-barista/cb-spider/cloud-info-manager/region-info-manager.GetRegion() - call GetRegion()
[CB-SPIDER].[INFO]: 2024-09-20 15:12:04 ConnectionConfigInfoManager.go:102, github.com/cloud-barista/cb-spider/cloud-info-manager/connection-config-info-manager.GetConnectionConfig() - call GetConnectionConfig()
[CB-SPIDER].[INFO]: 2024-09-20 15:12:04 ConnectionConfigInfoManager.go:102, github.com/cloud-barista/cb-spider/cloud-info-manager/connection-config-info-manager.GetConnectionConfig() - call GetConnectionConfig()
[CB-SPIDER].[INFO]: 2024-09-20 15:12:04 DriverInfoManager.go:117, github.com/cloud-barista/cb-spider/cloud-info-manager/driver-info-manager.GetCloudDriver() - call GetCloudDriver()
[CB-SPIDER].[INFO]: 2024-09-20 15:12:04 CloudDriverHandler_static.go:49, github.com/cloud-barista/cb-spider/cloud-control-manager.getCloudDriver() - CloudDriverHandler: called getStaticCloudDriver() - gcp-driver-v1.0.so
[CB-SPIDER].[INFO]: 2024-09-20 15:12:04 CredentialInfoManager.go:221, github.com/cloud-barista/cb-spider/cloud-info-manager/credential-info-manager.GetCredentialDecrypt() - call GetCredential()
[CB-SPIDER].[INFO]: 2024-09-20 15:12:04 RegionInfoManager.go:114, github.com/cloud-barista/cb-spider/cloud-info-manager/region-info-manager.GetRegion() - call GetRegion()
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:04 GCPDriver.go:83, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp.(*GCPDriver).ConnectCloud() - ################## getVMClient ##################
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:04 GCPDriver.go:84, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp.(*GCPDriver).ConnectCloud() - getVMClient
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:04 GCPDriver.go:85, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp.(*GCPDriver).ConnectCloud() - ################## getVMClient ##################
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:04 GCPDriver.go:91, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp.(*GCPDriver).ConnectCloud() - ################## getContainerClient ##################
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:04 GCPDriver.go:92, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp.(*GCPDriver).ConnectCloud() - getContainerClient
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:04 GCPDriver.go:93, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp.(*GCPDriver).ConnectCloud() - ################## getContainerClient ##################
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:04 GCPDriver.go:99, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp.(*GCPDriver).ConnectCloud() - ################## getBillingCatalogClient ##################
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:04 GCPDriver.go:100, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp.(*GCPDriver).ConnectCloud() - getBillingCatalogClient
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:04 GCPDriver.go:101, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp.(*GCPDriver).ConnectCloud() - ################## getBillingCatalogClient ##################
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:04 GCPDriver.go:107, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp.(*GCPDriver).ConnectCloud() - ################## getCostEstimationClient ##################
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:04 GCPDriver.go:108, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp.(*GCPDriver).ConnectCloud() - getCostEstimationClient
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:04 GCPDriver.go:109, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp.(*GCPDriver).ConnectCloud() - ################## getCostEstimationClient ##################
[CB-SPIDER].[INFO]: 2024-09-20 15:12:04 GCP_CloudConnection.go:140, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/connect.(*GCPCloudConnection).CreateClusterHandler() - GCP Cloud Driver: called CreateClusterHandler()!
[CB-SPIDER].[INFO]: 2024-09-20 15:12:04 ConnectionConfigInfoManager.go:102, github.com/cloud-barista/cb-spider/cloud-info-manager/connection-config-info-manager.GetConnectionConfig() - call GetConnectionConfig()
[CB-SPIDER].[INFO]: 2024-09-20 15:12:04 ClusterHandler.go:427, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.(*GCPClusterHandler).AddNodeGroup() - GCP Cloud Driver: called AddNodeGroup()
[CB-SPIDER].[INFO]: 2024-09-20 15:12:04 ClusterHandler.go:338, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.(*GCPClusterHandler).GetCluster() - GCP Cloud Driver: called GetCluster()
[CB-SPIDER].[INFO]: 2024-09-20 15:12:04 CommonHandler.go:87, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.GetCallLogScheme() - Call GCP GetCluster()
[CB-SPIDER].[INFO]: 2024-09-20 15:12:07 ClusterHandler.go:883, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.mappingClusterInfo() - metaSecurityGroupTags : [crmgpbkcpuqas6a93apg-crmgpbkcpuq064vnth20]
[CB-SPIDER].[INFO]: 2024-09-20 15:12:07 ClusterHandler.go:998, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.mappingClusterInfo() - Cluster status : RUNNINGActive
[CB-SPIDER].[INFO]: 2024-09-20 15:12:07 ClusterHandler.go:1216, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.convertCluster() - nodeGroupList [{{ng11-crmgqbscpuq064vnth3g ng11-crmgqbscpuq064vnth3g} {COS_CONTAINERD COS_CONTAINERD} e2-standard-2 pd-balanced 100 {NameId SystemId} true 1 1 1 Active [] [{InstanceGroup_0 gke-crmgqbscpuqas6a9-ng11-crmgqbscpuq-88c5ba5e-grp} {keypair crmgpr4cp-crmgpr4cpuq064vnth2g}]}]
[CB-SPIDER].[INFO]: 2024-09-20 15:12:07 ClusterHandler.go:1233, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.convertNodeGroup() - convertNodeGroup {{ng11-crmgqbscpuq064vnth3g ng11-crmgqbscpuq064vnth3g} {COS_CONTAINERD COS_CONTAINERD} e2-standard-2 pd-balanced 100 {NameId SystemId} true 1 1 1 Active [] [{InstanceGroup_0 gke-crmgqbscpuqas6a9-ng11-crmgqbscpuq-88c5ba5e-grp} {keypair crmgpr4cp-crmgpr4cpuq064vnth2g}]}
[CB-SPIDER].[INFO]: 2024-09-20 15:12:07 ClusterHandler.go:1237, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.convertNodeGroup() - keyValue {InstanceGroup_0 gke-crmgqbscpuqas6a9-ng11-crmgqbscpuq-88c5ba5e-grp}
[CB-SPIDER].[INFO]: 2024-09-20 15:12:07 ClusterHandler.go:1239, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.convertNodeGroup() - HasPrefix
[CB-SPIDER].[INFO]: 2024-09-20 15:12:10 ClusterHandler.go:1246, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.convertNodeGroup() - instanceList [gke-crmgqbscpuqas6a9-ng11-crmgqbscpuq-88c5ba5e-gh12]
[CB-SPIDER].[INFO]: 2024-09-20 15:12:12 ClusterHandler.go:1258, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.convertNodeGroup() - nodeList [{gke-crmgqbscpuqas6a9-ng11-crmgqbscpuq-88c5ba5e-gh12 gke-crmgqbscpuqas6a9-ng11-crmgqbscpuq-88c5ba5e-gh12}]
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:12 ClusterHandler.go:1261, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.convertNodeGroup() - {{ng11-crmgqbscpuq064vnth3g ng11-crmgqbscpuq064vnth3g} {COS_CONTAINERD COS_CONTAINERD} e2-standard-2 pd-balanced 100 {NameId SystemId} true 1 1 1 Active [{gke-crmgqbscpuqas6a9-ng11-crmgqbscpuq-88c5ba5e-gh12 gke-crmgqbscpuqas6a9-ng11-crmgqbscpuq-88c5ba5e-gh12}] [{InstanceGroup_0 gke-crmgqbscpuqas6a9-ng11-crmgqbscpuq-88c5ba5e-grp} {keypair crmgpr4cp-crmgpr4cpuq064vnth2g}]}
[CB-SPIDER].[INFO]: 2024-09-20 15:12:12 ClusterHandler.go:1237, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.convertNodeGroup() - keyValue {keypair crmgpr4cp-crmgpr4cpuq064vnth2g}
[CB-SPIDER].[INFO]: 2024-09-20 15:12:12 CommonHandler.go:87, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.GetCallLogScheme() - Call GCP AddNodeGroup()
[CB-SPIDER].[INFO]: 2024-09-20 15:12:12 ClusterHandler.go:503, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.(*GCPClusterHandler).AddNodeGroup() - parent : projects/sykim-etri-prj/locations/me-central1-a/clusters/crmgqbscpuqas6a93aq-crmgqbscpuq064vnth30
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:28 CommonHandler.go:538, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.WaitContainerOperationFail() - &{[]   <nil>  operation-1726812746277-1ec65591-0ed6-4f43-ad78-03050b76f6af [] CREATE_NODE_POOL <nil> https://container.googleapis.com/v1/projects/291929448520/zones/me-central1-a/operations/operation-1726812746277-1ec65591-0ed6-4f43-ad78-03050b76f6af 2024-09-20T06:12:26.277822996Z PENDING  https://container.googleapis.com/v1/projects/291929448520/zones/me-central1-a/clusters/crmgqbscpuqas6a93aq-crmgqbscpuq064vnth30/nodePools/ng111-crmh4d4cpuq064vnth4g me-central1-a {200 map[Alt-Svc:[h3=":443"; ma=2592000,h3-29=":443"; ma=2592000] Cache-Control:[private] Content-Type:[application/json; charset=UTF-8] Date:[Fri, 20 Sep 2024 06:12:28 GMT] Server:[ESF] Vary:[Origin X-Origin Referer] X-Content-Type-Options:[nosniff] X-Frame-Options:[SAMEORIGIN] X-Xss-Protection:[0]]} [] []}
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:34 CommonHandler.go:538, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.WaitContainerOperationFail() - &{[]   <nil>  operation-1726812746277-1ec65591-0ed6-4f43-ad78-03050b76f6af [] CREATE_NODE_POOL <nil> https://container.googleapis.com/v1/projects/291929448520/zones/me-central1-a/operations/operation-1726812746277-1ec65591-0ed6-4f43-ad78-03050b76f6af 2024-09-20T06:12:26.277822996Z PENDING  https://container.googleapis.com/v1/projects/291929448520/zones/me-central1-a/clusters/crmgqbscpuqas6a93aq-crmgqbscpuq064vnth30/nodePools/ng111-crmh4d4cpuq064vnth4g me-central1-a {200 map[Alt-Svc:[h3=":443"; ma=2592000,h3-29=":443"; ma=2592000] Cache-Control:[private] Content-Type:[application/json; charset=UTF-8] Date:[Fri, 20 Sep 2024 06:12:34 GMT] Server:[ESF] Vary:[Origin X-Origin Referer] X-Content-Type-Options:[nosniff] X-Frame-Options:[SAMEORIGIN] X-Xss-Protection:[0]]} [] []}
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:39 CommonHandler.go:538, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.WaitContainerOperationFail() - &{[]   <nil>  operation-1726812746277-1ec65591-0ed6-4f43-ad78-03050b76f6af [] CREATE_NODE_POOL <nil> https://container.googleapis.com/v1/projects/291929448520/zones/me-central1-a/operations/operation-1726812746277-1ec65591-0ed6-4f43-ad78-03050b76f6af 2024-09-20T06:12:26.277822996Z PENDING  https://container.googleapis.com/v1/projects/291929448520/zones/me-central1-a/clusters/crmgqbscpuqas6a93aq-crmgqbscpuq064vnth30/nodePools/ng111-crmh4d4cpuq064vnth4g me-central1-a {200 map[Alt-Svc:[h3=":443"; ma=2592000,h3-29=":443"; ma=2592000] Cache-Control:[private] Content-Type:[application/json; charset=UTF-8] Date:[Fri, 20 Sep 2024 06:12:39 GMT] Server:[ESF] Vary:[Origin X-Origin Referer] X-Content-Type-Options:[nosniff] X-Frame-Options:[SAMEORIGIN] X-Xss-Protection:[0]]} [] []}
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:45 CommonHandler.go:538, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.WaitContainerOperationFail() - &{[]   <nil>  operation-1726812746277-1ec65591-0ed6-4f43-ad78-03050b76f6af [] CREATE_NODE_POOL <nil> https://container.googleapis.com/v1/projects/291929448520/zones/me-central1-a/operations/operation-1726812746277-1ec65591-0ed6-4f43-ad78-03050b76f6af 2024-09-20T06:12:26.277822996Z PENDING  https://container.googleapis.com/v1/projects/291929448520/zones/me-central1-a/clusters/crmgqbscpuqas6a93aq-crmgqbscpuq064vnth30/nodePools/ng111-crmh4d4cpuq064vnth4g me-central1-a {200 map[Alt-Svc:[h3=":443"; ma=2592000,h3-29=":443"; ma=2592000] Cache-Control:[private] Content-Type:[application/json; charset=UTF-8] Date:[Fri, 20 Sep 2024 06:12:44 GMT] Server:[ESF] Vary:[Origin X-Origin Referer] X-Content-Type-Options:[nosniff] X-Frame-Options:[SAMEORIGIN] X-Xss-Protection:[0]]} [] []}
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:50 CommonHandler.go:538, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.WaitContainerOperationFail() - &{[]   <nil>  operation-1726812746277-1ec65591-0ed6-4f43-ad78-03050b76f6af [] CREATE_NODE_POOL <nil> https://container.googleapis.com/v1/projects/291929448520/zones/me-central1-a/operations/operation-1726812746277-1ec65591-0ed6-4f43-ad78-03050b76f6af 2024-09-20T06:12:26.277822996Z PENDING  https://container.googleapis.com/v1/projects/291929448520/zones/me-central1-a/clusters/crmgqbscpuqas6a93aq-crmgqbscpuq064vnth30/nodePools/ng111-crmh4d4cpuq064vnth4g me-central1-a {200 map[Alt-Svc:[h3=":443"; ma=2592000,h3-29=":443"; ma=2592000] Cache-Control:[private] Content-Type:[application/json; charset=UTF-8] Date:[Fri, 20 Sep 2024 06:12:50 GMT] Server:[ESF] Vary:[Origin X-Origin Referer] X-Content-Type-Options:[nosniff] X-Frame-Options:[SAMEORIGIN] X-Xss-Protection:[0]]} [] []}
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:57 CommonHandler.go:538, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.WaitContainerOperationFail() - &{[]   <nil>  operation-1726812746277-1ec65591-0ed6-4f43-ad78-03050b76f6af [] CREATE_NODE_POOL <nil> https://container.googleapis.com/v1/projects/291929448520/zones/me-central1-a/operations/operation-1726812746277-1ec65591-0ed6-4f43-ad78-03050b76f6af 2024-09-20T06:12:26.277822996Z PENDING  https://container.googleapis.com/v1/projects/291929448520/zones/me-central1-a/clusters/crmgqbscpuqas6a93aq-crmgqbscpuq064vnth30/nodePools/ng111-crmh4d4cpuq064vnth4g me-central1-a {200 map[Alt-Svc:[h3=":443"; ma=2592000,h3-29=":443"; ma=2592000] Cache-Control:[private] Content-Type:[application/json; charset=UTF-8] Date:[Fri, 20 Sep 2024 06:12:57 GMT] Server:[ESF] Vary:[Origin X-Origin Referer] X-Content-Type-Options:[nosniff] X-Frame-Options:[SAMEORIGIN] X-Xss-Protection:[0]]} [] []}
[CB-SPIDER].[INFO]: 2024-09-20 15:13:02 CommonHandler.go:560, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.WaitContainerOperationFail() - Forcing termination of Wait because the status of resource [operation-1726812746277-1ec65591-0ed6-4f43-ad78-03050b76f6af] has not failed within [30] seconds.
[CB-SPIDER].[INFO]: 2024-09-20 15:13:02 ClusterHandler.go:1190, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.getNodePools() - GCP Cloud Driver: called getNodePools() projects/sykim-etri-prj/locations/me-central1-a/clusters/crmgqbscpuqas6a93aq-crmgqbscpuq064vnth30/nodePools/ng111-crmh4d4cpuq064vnth4g
[CB-SPIDER].[INFO]: 2024-09-20 15:13:02 CommonHandler.go:87, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.GetCallLogScheme() - Call GCP getNodePools()
[CB-SPIDER].[ERROR]: 2024-09-20 15:13:03 ClusterHandler.go:1198, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.getNodePools() - Failed to getNodePools :  googleapi: Error 404: Not found: node pool "ng111-crmh4d4cpuq064vnth4g" not found.
Details:
[
  {
    "@type": "type.googleapis.com/google.rpc.RequestInfo",
    "requestId": "0x39c340060cab03d4"
  }
]
, notFound
[CB-SPIDER].[ERROR]: 2024-09-20 15:13:03 ClusterHandler.go:528, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.(*GCPClusterHandler).AddNodeGroup() - Failed to getNodePools :  googleapi: Error 404: Not found: node pool "ng111-crmh4d4cpuq064vnth4g" not found.
Details:
[
  {
    "@type": "type.googleapis.com/google.rpc.RequestInfo",
    "requestId": "0x39c340060cab03d4"
  }
]
, notFound
[CB-SPIDER].[ERROR]: 2024-09-20 15:13:03 ClusterManager.go:943, github.com/cloud-barista/cb-spider/api-runtime/common-runtime.AddNodeGroup() - Failed to getNodePools :  googleapi: Error 404: Not found: node pool "ng111-crmh4d4cpuq064vnth4g" not found.
Details:
[
  {
    "@type": "type.googleapis.com/google.rpc.RequestInfo",
    "requestId": "0x39c340060cab03d4"
  }
]
, notFound
powerkimhub commented 2 months ago

@hippo-an (@sykim-etri)

powerkimhub commented 1 month ago

@hippo-an (@sykim-etri)