Closed sykim-etri closed 1 month ago
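When adding a node group, the timeout is too short, so the node group information cannot be retrieved and an error is returned. After a little more time has passed, however, the node group turns out to have been created successfully.
This happens almost every time in the me-central1, me-central2, and me-west1 regions, and I confirmed that raising the timeout to 60 seconds lowers the failure rate.
The way the node group creation result is verified appears to need improvement.
The CB-SP log at the time of the error is as follows.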
[CB-SPIDER].[INFO]: 2024-09-20 15:12:04 ClusterRest.go:356, github.com/cloud-barista/cb-spider/api-runtime/rest-runtime.AddNodeGroup() - call AddNodeGroup()
[CB-SPIDER].[INFO]: 2024-09-20 15:12:04 ClusterManager.go:831, github.com/cloud-barista/cb-spider/api-runtime/common-runtime.AddNodeGroup() - call AddNodeGroup()
[CB-SPIDER].[INFO]: 2024-09-20 15:12:04 ConnectionConfigInfoManager.go:102, github.com/cloud-barista/cb-spider/cloud-info-manager/connection-config-info-manager.GetConnectionConfig() - call GetConnectionConfig()
[CB-SPIDER].[INFO]: 2024-09-20 15:12:04 RegionInfoManager.go:114, github.com/cloud-barista/cb-spider/cloud-info-manager/region-info-manager.GetRegion() - call GetRegion()
[CB-SPIDER].[INFO]: 2024-09-20 15:12:04 ConnectionConfigInfoManager.go:102, github.com/cloud-barista/cb-spider/cloud-info-manager/connection-config-info-manager.GetConnectionConfig() - call GetConnectionConfig()
[CB-SPIDER].[INFO]: 2024-09-20 15:12:04 ConnectionConfigInfoManager.go:102, github.com/cloud-barista/cb-spider/cloud-info-manager/connection-config-info-manager.GetConnectionConfig() - call GetConnectionConfig()
[CB-SPIDER].[INFO]: 2024-09-20 15:12:04 DriverInfoManager.go:117, github.com/cloud-barista/cb-spider/cloud-info-manager/driver-info-manager.GetCloudDriver() - call GetCloudDriver()
[CB-SPIDER].[INFO]: 2024-09-20 15:12:04 CloudDriverHandler_static.go:49, github.com/cloud-barista/cb-spider/cloud-control-manager.getCloudDriver() - CloudDriverHandler: called getStaticCloudDriver() - gcp-driver-v1.0.so
[CB-SPIDER].[INFO]: 2024-09-20 15:12:04 CredentialInfoManager.go:221, github.com/cloud-barista/cb-spider/cloud-info-manager/credential-info-manager.GetCredentialDecrypt() - call GetCredential()
[CB-SPIDER].[INFO]: 2024-09-20 15:12:04 RegionInfoManager.go:114, github.com/cloud-barista/cb-spider/cloud-info-manager/region-info-manager.GetRegion() - call GetRegion()
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:04 GCPDriver.go:83, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp.(*GCPDriver).ConnectCloud() - ################## getVMClient ##################
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:04 GCPDriver.go:84, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp.(*GCPDriver).ConnectCloud() - getVMClient
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:04 GCPDriver.go:85, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp.(*GCPDriver).ConnectCloud() - ################## getVMClient ##################
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:04 GCPDriver.go:91, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp.(*GCPDriver).ConnectCloud() - ################## getContainerClient ##################
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:04 GCPDriver.go:92, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp.(*GCPDriver).ConnectCloud() - getContainerClient
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:04 GCPDriver.go:93, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp.(*GCPDriver).ConnectCloud() - ################## getContainerClient ##################
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:04 GCPDriver.go:99, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp.(*GCPDriver).ConnectCloud() - ################## getBillingCatalogClient ##################
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:04 GCPDriver.go:100, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp.(*GCPDriver).ConnectCloud() - getBillingCatalogClient
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:04 GCPDriver.go:101, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp.(*GCPDriver).ConnectCloud() - ################## getBillingCatalogClient ##################
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:04 GCPDriver.go:107, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp.(*GCPDriver).ConnectCloud() - ################## getCostEstimationClient ##################
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:04 GCPDriver.go:108, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp.(*GCPDriver).ConnectCloud() - getCostEstimationClient
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:04 GCPDriver.go:109, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp.(*GCPDriver).ConnectCloud() - ################## getCostEstimationClient ##################
[CB-SPIDER].[INFO]: 2024-09-20 15:12:04 GCP_CloudConnection.go:140, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/connect.(*GCPCloudConnection).CreateClusterHandler() - GCP Cloud Driver: called CreateClusterHandler()!
[CB-SPIDER].[INFO]: 2024-09-20 15:12:04 ConnectionConfigInfoManager.go:102, github.com/cloud-barista/cb-spider/cloud-info-manager/connection-config-info-manager.GetConnectionConfig() - call GetConnectionConfig()
[CB-SPIDER].[INFO]: 2024-09-20 15:12:04 ClusterHandler.go:427, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.(*GCPClusterHandler).AddNodeGroup() - GCP Cloud Driver: called AddNodeGroup()
[CB-SPIDER].[INFO]: 2024-09-20 15:12:04 ClusterHandler.go:338, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.(*GCPClusterHandler).GetCluster() - GCP Cloud Driver: called GetCluster()
[CB-SPIDER].[INFO]: 2024-09-20 15:12:04 CommonHandler.go:87, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.GetCallLogScheme() - Call GCP GetCluster()
[CB-SPIDER].[INFO]: 2024-09-20 15:12:07 ClusterHandler.go:883, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.mappingClusterInfo() - metaSecurityGroupTags : [crmgpbkcpuqas6a93apg-crmgpbkcpuq064vnth20]
[CB-SPIDER].[INFO]: 2024-09-20 15:12:07 ClusterHandler.go:998, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.mappingClusterInfo() - Cluster status : RUNNINGActive
[CB-SPIDER].[INFO]: 2024-09-20 15:12:07 ClusterHandler.go:1216, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.convertCluster() - nodeGroupList [{{ng11-crmgqbscpuq064vnth3g ng11-crmgqbscpuq064vnth3g} {COS_CONTAINERD COS_CONTAINERD} e2-standard-2 pd-balanced 100 {NameId SystemId} true 1 1 1 Active [] [{InstanceGroup_0 gke-crmgqbscpuqas6a9-ng11-crmgqbscpuq-88c5ba5e-grp} {keypair crmgpr4cp-crmgpr4cpuq064vnth2g}]}]
[CB-SPIDER].[INFO]: 2024-09-20 15:12:07 ClusterHandler.go:1233, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.convertNodeGroup() - convertNodeGroup {{ng11-crmgqbscpuq064vnth3g ng11-crmgqbscpuq064vnth3g} {COS_CONTAINERD COS_CONTAINERD} e2-standard-2 pd-balanced 100 {NameId SystemId} true 1 1 1 Active [] [{InstanceGroup_0 gke-crmgqbscpuqas6a9-ng11-crmgqbscpuq-88c5ba5e-grp} {keypair crmgpr4cp-crmgpr4cpuq064vnth2g}]}
[CB-SPIDER].[INFO]: 2024-09-20 15:12:07 ClusterHandler.go:1237, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.convertNodeGroup() - keyValue {InstanceGroup_0 gke-crmgqbscpuqas6a9-ng11-crmgqbscpuq-88c5ba5e-grp}
[CB-SPIDER].[INFO]: 2024-09-20 15:12:07 ClusterHandler.go:1239, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.convertNodeGroup() - HasPrefix
[CB-SPIDER].[INFO]: 2024-09-20 15:12:10 ClusterHandler.go:1246, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.convertNodeGroup() - instanceList [gke-crmgqbscpuqas6a9-ng11-crmgqbscpuq-88c5ba5e-gh12]
[CB-SPIDER].[INFO]: 2024-09-20 15:12:12 ClusterHandler.go:1258, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.convertNodeGroup() - nodeList [{gke-crmgqbscpuqas6a9-ng11-crmgqbscpuq-88c5ba5e-gh12 gke-crmgqbscpuqas6a9-ng11-crmgqbscpuq-88c5ba5e-gh12}]
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:12 ClusterHandler.go:1261, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.convertNodeGroup() - {{ng11-crmgqbscpuq064vnth3g ng11-crmgqbscpuq064vnth3g} {COS_CONTAINERD COS_CONTAINERD} e2-standard-2 pd-balanced 100 {NameId SystemId} true 1 1 1 Active [{gke-crmgqbscpuqas6a9-ng11-crmgqbscpuq-88c5ba5e-gh12 gke-crmgqbscpuqas6a9-ng11-crmgqbscpuq-88c5ba5e-gh12}] [{InstanceGroup_0 gke-crmgqbscpuqas6a9-ng11-crmgqbscpuq-88c5ba5e-grp} {keypair crmgpr4cp-crmgpr4cpuq064vnth2g}]}
[CB-SPIDER].[INFO]: 2024-09-20 15:12:12 ClusterHandler.go:1237, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.convertNodeGroup() - keyValue {keypair crmgpr4cp-crmgpr4cpuq064vnth2g}
[CB-SPIDER].[INFO]: 2024-09-20 15:12:12 CommonHandler.go:87, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.GetCallLogScheme() - Call GCP AddNodeGroup()
[CB-SPIDER].[INFO]: 2024-09-20 15:12:12 ClusterHandler.go:503, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.(*GCPClusterHandler).AddNodeGroup() - parent : projects/sykim-etri-prj/locations/me-central1-a/clusters/crmgqbscpuqas6a93aq-crmgqbscpuq064vnth30
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:28 CommonHandler.go:538, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.WaitContainerOperationFail() - &{[] <nil> operation-1726812746277-1ec65591-0ed6-4f43-ad78-03050b76f6af [] CREATE_NODE_POOL <nil> https://container.googleapis.com/v1/projects/291929448520/zones/me-central1-a/operations/operation-1726812746277-1ec65591-0ed6-4f43-ad78-03050b76f6af 2024-09-20T06:12:26.277822996Z PENDING https://container.googleapis.com/v1/projects/291929448520/zones/me-central1-a/clusters/crmgqbscpuqas6a93aq-crmgqbscpuq064vnth30/nodePools/ng111-crmh4d4cpuq064vnth4g me-central1-a {200 map[Alt-Svc:[h3=":443"; ma=2592000,h3-29=":443"; ma=2592000] Cache-Control:[private] Content-Type:[application/json; charset=UTF-8] Date:[Fri, 20 Sep 2024 06:12:28 GMT] Server:[ESF] Vary:[Origin X-Origin Referer] X-Content-Type-Options:[nosniff] X-Frame-Options:[SAMEORIGIN] X-Xss-Protection:[0]]} [] []}
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:34 CommonHandler.go:538, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.WaitContainerOperationFail() - &{[] <nil> operation-1726812746277-1ec65591-0ed6-4f43-ad78-03050b76f6af [] CREATE_NODE_POOL <nil> https://container.googleapis.com/v1/projects/291929448520/zones/me-central1-a/operations/operation-1726812746277-1ec65591-0ed6-4f43-ad78-03050b76f6af 2024-09-20T06:12:26.277822996Z PENDING https://container.googleapis.com/v1/projects/291929448520/zones/me-central1-a/clusters/crmgqbscpuqas6a93aq-crmgqbscpuq064vnth30/nodePools/ng111-crmh4d4cpuq064vnth4g me-central1-a {200 map[Alt-Svc:[h3=":443"; ma=2592000,h3-29=":443"; ma=2592000] Cache-Control:[private] Content-Type:[application/json; charset=UTF-8] Date:[Fri, 20 Sep 2024 06:12:34 GMT] Server:[ESF] Vary:[Origin X-Origin Referer] X-Content-Type-Options:[nosniff] X-Frame-Options:[SAMEORIGIN] X-Xss-Protection:[0]]} [] []}
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:39 CommonHandler.go:538, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.WaitContainerOperationFail() - &{[] <nil> operation-1726812746277-1ec65591-0ed6-4f43-ad78-03050b76f6af [] CREATE_NODE_POOL <nil> https://container.googleapis.com/v1/projects/291929448520/zones/me-central1-a/operations/operation-1726812746277-1ec65591-0ed6-4f43-ad78-03050b76f6af 2024-09-20T06:12:26.277822996Z PENDING https://container.googleapis.com/v1/projects/291929448520/zones/me-central1-a/clusters/crmgqbscpuqas6a93aq-crmgqbscpuq064vnth30/nodePools/ng111-crmh4d4cpuq064vnth4g me-central1-a {200 map[Alt-Svc:[h3=":443"; ma=2592000,h3-29=":443"; ma=2592000] Cache-Control:[private] Content-Type:[application/json; charset=UTF-8] Date:[Fri, 20 Sep 2024 06:12:39 GMT] Server:[ESF] Vary:[Origin X-Origin Referer] X-Content-Type-Options:[nosniff] X-Frame-Options:[SAMEORIGIN] X-Xss-Protection:[0]]} [] []}
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:45 CommonHandler.go:538, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.WaitContainerOperationFail() - &{[] <nil> operation-1726812746277-1ec65591-0ed6-4f43-ad78-03050b76f6af [] CREATE_NODE_POOL <nil> https://container.googleapis.com/v1/projects/291929448520/zones/me-central1-a/operations/operation-1726812746277-1ec65591-0ed6-4f43-ad78-03050b76f6af 2024-09-20T06:12:26.277822996Z PENDING https://container.googleapis.com/v1/projects/291929448520/zones/me-central1-a/clusters/crmgqbscpuqas6a93aq-crmgqbscpuq064vnth30/nodePools/ng111-crmh4d4cpuq064vnth4g me-central1-a {200 map[Alt-Svc:[h3=":443"; ma=2592000,h3-29=":443"; ma=2592000] Cache-Control:[private] Content-Type:[application/json; charset=UTF-8] Date:[Fri, 20 Sep 2024 06:12:44 GMT] Server:[ESF] Vary:[Origin X-Origin Referer] X-Content-Type-Options:[nosniff] X-Frame-Options:[SAMEORIGIN] X-Xss-Protection:[0]]} [] []}
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:50 CommonHandler.go:538, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.WaitContainerOperationFail() - &{[] <nil> operation-1726812746277-1ec65591-0ed6-4f43-ad78-03050b76f6af [] CREATE_NODE_POOL <nil> https://container.googleapis.com/v1/projects/291929448520/zones/me-central1-a/operations/operation-1726812746277-1ec65591-0ed6-4f43-ad78-03050b76f6af 2024-09-20T06:12:26.277822996Z PENDING https://container.googleapis.com/v1/projects/291929448520/zones/me-central1-a/clusters/crmgqbscpuqas6a93aq-crmgqbscpuq064vnth30/nodePools/ng111-crmh4d4cpuq064vnth4g me-central1-a {200 map[Alt-Svc:[h3=":443"; ma=2592000,h3-29=":443"; ma=2592000] Cache-Control:[private] Content-Type:[application/json; charset=UTF-8] Date:[Fri, 20 Sep 2024 06:12:50 GMT] Server:[ESF] Vary:[Origin X-Origin Referer] X-Content-Type-Options:[nosniff] X-Frame-Options:[SAMEORIGIN] X-Xss-Protection:[0]]} [] []}
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:57 CommonHandler.go:538, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.WaitContainerOperationFail() - &{[] <nil> operation-1726812746277-1ec65591-0ed6-4f43-ad78-03050b76f6af [] CREATE_NODE_POOL <nil> https://container.googleapis.com/v1/projects/291929448520/zones/me-central1-a/operations/operation-1726812746277-1ec65591-0ed6-4f43-ad78-03050b76f6af 2024-09-20T06:12:26.277822996Z PENDING https://container.googleapis.com/v1/projects/291929448520/zones/me-central1-a/clusters/crmgqbscpuqas6a93aq-crmgqbscpuq064vnth30/nodePools/ng111-crmh4d4cpuq064vnth4g me-central1-a {200 map[Alt-Svc:[h3=":443"; ma=2592000,h3-29=":443"; ma=2592000] Cache-Control:[private] Content-Type:[application/json; charset=UTF-8] Date:[Fri, 20 Sep 2024 06:12:57 GMT] Server:[ESF] Vary:[Origin X-Origin Referer] X-Content-Type-Options:[nosniff] X-Frame-Options:[SAMEORIGIN] X-Xss-Protection:[0]]} [] []}
[CB-SPIDER].[INFO]: 2024-09-20 15:13:02 CommonHandler.go:560, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.WaitContainerOperationFail() - Forcing termination of Wait because the status of resource [operation-1726812746277-1ec65591-0ed6-4f43-ad78-03050b76f6af] has not failed within [30] seconds.
[CB-SPIDER].[INFO]: 2024-09-20 15:13:02 ClusterHandler.go:1190, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.getNodePools() - GCP Cloud Driver: called getNodePools() projects/sykim-etri-prj/locations/me-central1-a/clusters/crmgqbscpuqas6a93aq-crmgqbscpuq064vnth30/nodePools/ng111-crmh4d4cpuq064vnth4g
[CB-SPIDER].[INFO]: 2024-09-20 15:13:02 CommonHandler.go:87, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.GetCallLogScheme() - Call GCP getNodePools()
[CB-SPIDER].[ERROR]: 2024-09-20 15:13:03 ClusterHandler.go:1198, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.getNodePools() - Failed to getNodePools : googleapi: Error 404: Not found: node pool "ng111-crmh4d4cpuq064vnth4g" not found. Details: [ { "@type": "type.googleapis.com/google.rpc.RequestInfo", "requestId": "0x39c340060cab03d4" } ] , notFound
[CB-SPIDER].[ERROR]: 2024-09-20 15:13:03 ClusterHandler.go:528, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.(*GCPClusterHandler).AddNodeGroup() - Failed to getNodePools : googleapi: Error 404: Not found: node pool "ng111-crmh4d4cpuq064vnth4g" not found. Details: [ { "@type": "type.googleapis.com/google.rpc.RequestInfo", "requestId": "0x39c340060cab03d4" } ] , notFound
[CB-SPIDER].[ERROR]: 2024-09-20 15:13:03 ClusterManager.go:943, github.com/cloud-barista/cb-spider/api-runtime/common-runtime.AddNodeGroup() - Failed to getNodePools : googleapi: Error 404: Not found: node pool "ng111-crmh4d4cpuq064vnth4g" not found. Details: [ { "@type": "type.googleapis.com/google.rpc.RequestInfo", "requestId": "0x39c340060cab03d4" } ] , notFound