akash-network / support

Akash Support and Issue Tracking

`kube-builder: ClusterParams() returned result of unexpected type (%!s(<nil>))` on `send-manifest` (on tx deployment update) #152

Open andy108369 opened 10 months ago

andy108369 commented 10 months ago

provider-services 0.4.8 (provider and CLI client), Akash Network 0.28.2

I am still seeing this error (`err="kube-builder: ClusterParams() returned result of unexpected type (%!s(<nil>))"`) on the Hurricane provider with k8s v1.27.5 (deployed with kubespray v2.23.0) when sending the manifest to the provider via the CLI. It is not limited to image updates in the SDL; it also happens on env updates.

It does not happen every time, only sporadically.


Provider logs:

D[2023-11-22|16:45:37.103] running check                                module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/13697605/1/1/akash15tl6v6gd0nte0syyxnv57zmmspgju4c3xfmdhk manifest-group=dcloud cmp=deployment-monitor attempt=1
I[2023-11-22|16:45:37.135] check result                                 module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/13697605/1/1/akash15tl6v6gd0nte0syyxnv57zmmspgju4c3xfmdhk manifest-group=dcloud cmp=deployment-monitor ok=true attempt=1
I[2023-11-22|16:45:47.516] update received                              module=provider-manifest cmp=provider deployment=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/13697605 version=C761DDE12EAAD74D36ACD78EB57DFF035836BD85C162E9B1A071B38313D57BEE
D[2023-11-22|16:45:48.377] running check                                module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/13697605/1/1/akash15tl6v6gd0nte0syyxnv57zmmspgju4c3xfmdhk manifest-group=dcloud cmp=deployment-monitor attempt=1
I[2023-11-22|16:45:48.403] check result                                 module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/13697605/1/1/akash15tl6v6gd0nte0syyxnv57zmmspgju4c3xfmdhk manifest-group=dcloud cmp=deployment-monitor ok=true attempt=1
I[2023-11-22|16:45:54.207] manifest received                            module=manifest-manager cmp=provider deployment=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/13697605
I[2023-11-22|16:45:54.210] data received                                module=manifest-manager cmp=provider deployment=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/13697605 version=c761dde12eaad74d36acd78eb57dff035836bd85c162e9b1a071b38313d57bee
D[2023-11-22|16:45:54.210] requests valid                               module=manifest-manager cmp=provider deployment=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/13697605 num-requests=1
D[2023-11-22|16:45:54.210] publishing manifest received                 module=manifest-manager cmp=provider deployment=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/13697605 num-leases=1
D[2023-11-22|16:45:54.210] publishing manifest received for lease       module=manifest-manager cmp=provider deployment=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/13697605 lease_id=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/13697605/1/1/akash15tl6v6gd0nte0syyxnv57zmmspgju4c3xfmdhk
I[2023-11-22|16:45:54.210] manifest received                            module=provider-cluster cmp=provider cmp=service lease=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/13697605/1/1/akash15tl6v6gd0nte0syyxnv57zmmspgju4c3xfmdhk
D[2023-11-22|16:45:54.211] shutting down                                module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/13697605/1/1/akash15tl6v6gd0nte0syyxnv57zmmspgju4c3xfmdhk manifest-group=dcloud cmp=deployment-monitor
D[2023-11-22|16:45:54.211] shutdown complete                            module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/13697605/1/1/akash15tl6v6gd0nte0syyxnv57zmmspgju4c3xfmdhk manifest-group=dcloud cmp=deployment-monitor
I[2023-11-22|16:45:54.219] hostnames withheld                           module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/13697605/1/1/akash15tl6v6gd0nte0syyxnv57zmmspgju4c3xfmdhk manifest-group=dcloud cnt=0
E[2023-11-22|16:45:54.219] deploying workload                           module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/13697605/1/1/akash15tl6v6gd0nte0syyxnv57zmmspgju4c3xfmdhk manifest-group=dcloud err="kube-builder: ClusterParams() returned result of unexpected type (%!s(<nil>))"
E[2023-11-22|16:45:54.219] execution error                              module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/13697605/1/1/akash15tl6v6gd0nte0syyxnv57zmmspgju4c3xfmdhk manifest-group=dcloud state=deploy-active err="kube-builder: ClusterParams() returned result of unexpected type (%!s(<nil>))"
D[2023-11-22|16:45:54.232] purged ips                                   module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/13697605/1/1/akash15tl6v6gd0nte0syyxnv57zmmspgju4c3xfmdhk manifest-group=dcloud
D[2023-11-22|16:45:54.248] purged hostnames                             module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/13697605/1/1/akash15tl6v6gd0nte0syyxnv57zmmspgju4c3xfmdhk manifest-group=dcloud
D[2023-11-22|16:45:54.248] teardown complete                            module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/13697605/1/1/akash15tl6v6gd0nte0syyxnv57zmmspgju4c3xfmdhk manifest-group=dcloud
D[2023-11-22|16:45:54.248] shutting down                                module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/13697605/1/1/akash15tl6v6gd0nte0syyxnv57zmmspgju4c3xfmdhk manifest-group=dcloud
D[2023-11-22|16:45:54.248] waiting on dm.wg                             module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/13697605/1/1/akash15tl6v6gd0nte0syyxnv57zmmspgju4c3xfmdhk manifest-group=dcloud
I[2023-11-22|16:45:54.248] shutdown complete                            module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/13697605/1/1/akash15tl6v6gd0nte0syyxnv57zmmspgju4c3xfmdhk manifest-group=dcloud
D[2023-11-22|16:45:54.248] hostnames released                           module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/13697605/1/1/akash15tl6v6gd0nte0syyxnv57zmmspgju4c3xfmdhk manifest-group=dcloud
D[2023-11-22|16:45:54.248] sending manager into channel                 module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/13697605/1/1/akash15tl6v6gd0nte0syyxnv57zmmspgju4c3xfmdhk manifest-group=dcloud
I[2023-11-22|16:45:54.248] manager done                                 module=provider-cluster cmp=provider cmp=service lease=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/13697605/1/1/akash15tl6v6gd0nte0syyxnv57zmmspgju4c3xfmdhk
D[2023-11-22|16:45:54.248] unreserving capacity                         module=provider-cluster cmp=provider cmp=service cmp=inventory-service order=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/13697605/1/1
I[2023-11-22|16:45:54.248] attempting to removing reservation           module=provider-cluster cmp=provider cmp=service cmp=inventory-service order=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/13697605/1/1
I[2023-11-22|16:45:54.248] removing reservation                         module=provider-cluster cmp=provider cmp=service cmp=inventory-service order=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/13697605/1/1
I[2023-11-22|16:45:54.248] unreserve capacity complete                  module=provider-cluster cmp=provider cmp=service cmp=inventory-service order=akash1h24fljt7p0nh82cq0za0uhsct3sfwsfu9w3c9h/13697605/1/1

andy108369 commented 6 months ago

Still happens with provider 0.5.4 on Akash Network 0.32.2. So far I have only observed it on the Hurricane provider.

It feels like this issue triggers while the provider is scanning through the leases (the `running check` / `check result` cycle, which runs almost constantly on Hurricane according to the provider logs) and there is not enough delay between `tx deployment update` and `send-manifest`.

andy108369 commented 1 month ago

Still happens with provider 0.6.2 on Akash Network 0.36.0.

Example with dseq 17438710: kube-builder errored with `ClusterParams() returned result of unexpected type (%!s(<nil>))` upon updating the SDL.

Provider logs: 152-hurricane.log

$ cat /tmp/152-hurricane.log | grep -Ev 'operator=ip|running check|check result|below target' | grep 17438710
I[2024-08-13|16:23:13.197] update received                              module=provider-manifest cmp=provider deployment=akash1qh0f0h7jlq4x5gpxghrxvps5l09y7uuvcumcyd/17438710 version=7C21B33A56D24DDBDFF34960DF02751567DE89C89EEDF01D9B95A26642879BE1
I[2024-08-13|16:23:22.264] manifest received                            module=manifest-manager cmp=provider deployment=akash1qh0f0h7jlq4x5gpxghrxvps5l09y7uuvcumcyd/17438710
I[2024-08-13|16:23:22.266] data received                                module=manifest-manager cmp=provider deployment=akash1qh0f0h7jlq4x5gpxghrxvps5l09y7uuvcumcyd/17438710 version=7c21b33a56d24ddbdff34960df02751567de89c89eedf01d9b95a26642879be1
D[2024-08-13|16:23:22.267] requests valid                               module=manifest-manager cmp=provider deployment=akash1qh0f0h7jlq4x5gpxghrxvps5l09y7uuvcumcyd/17438710 num-requests=1
D[2024-08-13|16:23:22.267] publishing manifest received                 module=manifest-manager cmp=provider deployment=akash1qh0f0h7jlq4x5gpxghrxvps5l09y7uuvcumcyd/17438710 num-leases=1
D[2024-08-13|16:23:22.267] publishing manifest received for lease       module=manifest-manager cmp=provider deployment=akash1qh0f0h7jlq4x5gpxghrxvps5l09y7uuvcumcyd/17438710 lease_id=akash1qh0f0h7jlq4x5gpxghrxvps5l09y7uuvcumcyd/17438710/1/1/akash15tl6v6gd0nte0syyxnv57zmmspgju4c3xfmdhk
I[2024-08-13|16:23:22.267] manifest received                            module=provider-cluster cmp=provider cmp=service lease=akash1qh0f0h7jlq4x5gpxghrxvps5l09y7uuvcumcyd/17438710/1/1/akash15tl6v6gd0nte0syyxnv57zmmspgju4c3xfmdhk
D[2024-08-13|16:23:22.267] shutting down                                module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash1qh0f0h7jlq4x5gpxghrxvps5l09y7uuvcumcyd/17438710/1/1/akash15tl6v6gd0nte0syyxnv57zmmspgju4c3xfmdhk manifest-group=dcloud cmp=deployment-monitor
D[2024-08-13|16:23:22.267] shutdown complete                            module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash1qh0f0h7jlq4x5gpxghrxvps5l09y7uuvcumcyd/17438710/1/1/akash15tl6v6gd0nte0syyxnv57zmmspgju4c3xfmdhk manifest-group=dcloud cmp=deployment-monitor
I[2024-08-13|16:23:22.272] hostnames withheld                           module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash1qh0f0h7jlq4x5gpxghrxvps5l09y7uuvcumcyd/17438710/1/1/akash15tl6v6gd0nte0syyxnv57zmmspgju4c3xfmdhk manifest-group=dcloud cnt=0
E[2024-08-13|16:23:22.272] deploying workload                           module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash1qh0f0h7jlq4x5gpxghrxvps5l09y7uuvcumcyd/17438710/1/1/akash15tl6v6gd0nte0syyxnv57zmmspgju4c3xfmdhk manifest-group=dcloud err="kube-builder: ClusterParams() returned result of unexpected type (%!s(<nil>))"
E[2024-08-13|16:23:22.272] execution error                              module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash1qh0f0h7jlq4x5gpxghrxvps5l09y7uuvcumcyd/17438710/1/1/akash15tl6v6gd0nte0syyxnv57zmmspgju4c3xfmdhk manifest-group=dcloud state=deploy-active err="kube-builder: ClusterParams() returned result of unexpected type (%!s(<nil>))"
D[2024-08-13|16:23:22.276] purged ips                                   module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash1qh0f0h7jlq4x5gpxghrxvps5l09y7uuvcumcyd/17438710/1/1/akash15tl6v6gd0nte0syyxnv57zmmspgju4c3xfmdhk manifest-group=dcloud
D[2024-08-13|16:23:22.297] purged hostnames                             module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash1qh0f0h7jlq4x5gpxghrxvps5l09y7uuvcumcyd/17438710/1/1/akash15tl6v6gd0nte0syyxnv57zmmspgju4c3xfmdhk manifest-group=dcloud
D[2024-08-13|16:23:22.297] teardown complete                            module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash1qh0f0h7jlq4x5gpxghrxvps5l09y7uuvcumcyd/17438710/1/1/akash15tl6v6gd0nte0syyxnv57zmmspgju4c3xfmdhk manifest-group=dcloud
D[2024-08-13|16:23:22.297] shutting down                                module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash1qh0f0h7jlq4x5gpxghrxvps5l09y7uuvcumcyd/17438710/1/1/akash15tl6v6gd0nte0syyxnv57zmmspgju4c3xfmdhk manifest-group=dcloud
D[2024-08-13|16:23:22.297] waiting on dm.wg                             module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash1qh0f0h7jlq4x5gpxghrxvps5l09y7uuvcumcyd/17438710/1/1/akash15tl6v6gd0nte0syyxnv57zmmspgju4c3xfmdhk manifest-group=dcloud
I[2024-08-13|16:23:22.297] shutdown complete                            module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash1qh0f0h7jlq4x5gpxghrxvps5l09y7uuvcumcyd/17438710/1/1/akash15tl6v6gd0nte0syyxnv57zmmspgju4c3xfmdhk manifest-group=dcloud
D[2024-08-13|16:23:22.297] hostnames released                           module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash1qh0f0h7jlq4x5gpxghrxvps5l09y7uuvcumcyd/17438710/1/1/akash15tl6v6gd0nte0syyxnv57zmmspgju4c3xfmdhk manifest-group=dcloud
D[2024-08-13|16:23:22.297] sending manager into channel                 module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash1qh0f0h7jlq4x5gpxghrxvps5l09y7uuvcumcyd/17438710/1/1/akash15tl6v6gd0nte0syyxnv57zmmspgju4c3xfmdhk manifest-group=dcloud
I[2024-08-13|16:23:22.297] manager done                                 module=provider-cluster cmp=provider cmp=service lease=akash1qh0f0h7jlq4x5gpxghrxvps5l09y7uuvcumcyd/17438710/1/1/akash15tl6v6gd0nte0syyxnv57zmmspgju4c3xfmdhk
D[2024-08-13|16:23:22.297] unreserving capacity                         module=provider-cluster cmp=provider cmp=service cmp=inventory-service order=akash1qh0f0h7jlq4x5gpxghrxvps5l09y7uuvcumcyd/17438710/1/1
I[2024-08-13|16:23:22.297] attempting to removing reservation           module=provider-cluster cmp=provider cmp=service cmp=inventory-service order=akash1qh0f0h7jlq4x5gpxghrxvps5l09y7uuvcumcyd/17438710/1/1
I[2024-08-13|16:23:22.297] removing reservation                         module=provider-cluster cmp=provider cmp=service cmp=inventory-service order=akash1qh0f0h7jlq4x5gpxghrxvps5l09y7uuvcumcyd/17438710/1/1
I[2024-08-13|16:23:22.297] unreserve capacity complete                  module=provider-cluster cmp=provider cmp=service cmp=inventory-service order=akash1qh0f0h7jlq4x5gpxghrxvps5l09y7uuvcumcyd/17438710/1/1

andy108369 commented 1 month ago

The issue is still present in provider 0.6.4. Additional logs are stored under the node2.hurricane.akash.pub:/root/issue-152-logs directory.


andy108369 commented 1 month ago

Spotted the same issue on the Valdi provider for dseqs 17676873 and 17687779.

Complete provider logs are saved under the root@node2.h100.wdc.val.akash.pub:/root/provider-logs-issue-152 directory.

D[2024-08-21|21:22:31.786] running check                                module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash19jqc8tsdtzvm2zd4mcg0vx9fll4feegfduvpp8/17676873/1/1/akash19ah5c95kq4kz2g6q5rdkdgt80kc3xycsd8plq8 manifest-group=dcloud cmp=deployment-monitor attempt=1
I[2024-08-21|21:22:31.807] check result                                 module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash19jqc8tsdtzvm2zd4mcg0vx9fll4feegfduvpp8/17676873/1/1/akash19ah5c95kq4kz2g6q5rdkdgt80kc3xycsd8plq8 manifest-group=dcloud cmp=deployment-monitor ok=true attempt=1
I[2024-08-21|21:22:41.874] update received                              module=provider-manifest cmp=provider deployment=akash19jqc8tsdtzvm2zd4mcg0vx9fll4feegfduvpp8/17676873 version=1824113459BC475B447403E58AE0CBF45DB47A89C5E6E295A7F2C27FE3679D56
D[2024-08-21|21:22:43.433] running check                                module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash19jqc8tsdtzvm2zd4mcg0vx9fll4feegfduvpp8/17676873/1/1/akash19ah5c95kq4kz2g6q5rdkdgt80kc3xycsd8plq8 manifest-group=dcloud cmp=deployment-monitor attempt=1
I[2024-08-21|21:22:43.453] check result                                 module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash19jqc8tsdtzvm2zd4mcg0vx9fll4feegfduvpp8/17676873/1/1/akash19ah5c95kq4kz2g6q5rdkdgt80kc3xycsd8plq8 manifest-group=dcloud cmp=deployment-monitor ok=true attempt=1
I[2024-08-21|21:22:50.428] manifest received                            module=manifest-manager cmp=provider deployment=akash19jqc8tsdtzvm2zd4mcg0vx9fll4feegfduvpp8/17676873
I[2024-08-21|21:22:50.433] data received                                module=manifest-manager cmp=provider deployment=akash19jqc8tsdtzvm2zd4mcg0vx9fll4feegfduvpp8/17676873 version=1824113459bc475b447403e58ae0cbf45db47a89c5e6e295a7f2c27fe3679d56
D[2024-08-21|21:22:50.434] requests valid                               module=manifest-manager cmp=provider deployment=akash19jqc8tsdtzvm2zd4mcg0vx9fll4feegfduvpp8/17676873 num-requests=1
D[2024-08-21|21:22:50.434] publishing manifest received                 module=manifest-manager cmp=provider deployment=akash19jqc8tsdtzvm2zd4mcg0vx9fll4feegfduvpp8/17676873 num-leases=1
D[2024-08-21|21:22:50.434] publishing manifest received for lease       module=manifest-manager cmp=provider deployment=akash19jqc8tsdtzvm2zd4mcg0vx9fll4feegfduvpp8/17676873 lease_id=akash19jqc8tsdtzvm2zd4mcg0vx9fll4feegfduvpp8/17676873/1/1/akash19ah5c95kq4kz2g6q5rdkdgt80kc3xycsd8plq8
I[2024-08-21|21:22:50.434] manifest received                            module=provider-cluster cmp=provider cmp=service lease=akash19jqc8tsdtzvm2zd4mcg0vx9fll4feegfduvpp8/17676873/1/1/akash19ah5c95kq4kz2g6q5rdkdgt80kc3xycsd8plq8
D[2024-08-21|21:22:50.435] shutting down                                module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash19jqc8tsdtzvm2zd4mcg0vx9fll4feegfduvpp8/17676873/1/1/akash19ah5c95kq4kz2g6q5rdkdgt80kc3xycsd8plq8 manifest-group=dcloud cmp=deployment-monitor
D[2024-08-21|21:22:50.435] shutdown complete                            module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash19jqc8tsdtzvm2zd4mcg0vx9fll4feegfduvpp8/17676873/1/1/akash19ah5c95kq4kz2g6q5rdkdgt80kc3xycsd8plq8 manifest-group=dcloud cmp=deployment-monitor
I[2024-08-21|21:22:50.441] hostnames withheld                           module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash19jqc8tsdtzvm2zd4mcg0vx9fll4feegfduvpp8/17676873/1/1/akash19ah5c95kq4kz2g6q5rdkdgt80kc3xycsd8plq8 manifest-group=dcloud cnt=0
E[2024-08-21|21:22:50.441] deploying workload                           module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash19jqc8tsdtzvm2zd4mcg0vx9fll4feegfduvpp8/17676873/1/1/akash19ah5c95kq4kz2g6q5rdkdgt80kc3xycsd8plq8 manifest-group=dcloud err="kube-builder: ClusterParams() returned result of unexpected type (%!s(<nil>))"
E[2024-08-21|21:22:50.441] execution error                              module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash19jqc8tsdtzvm2zd4mcg0vx9fll4feegfduvpp8/17676873/1/1/akash19ah5c95kq4kz2g6q5rdkdgt80kc3xycsd8plq8 manifest-group=dcloud state=deploy-active err="kube-builder: ClusterParams() returned result of unexpected type (%!s(<nil>))"
D[2024-08-21|21:22:50.445] purged ips                                   module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash19jqc8tsdtzvm2zd4mcg0vx9fll4feegfduvpp8/17676873/1/1/akash19ah5c95kq4kz2g6q5rdkdgt80kc3xycsd8plq8 manifest-group=dcloud
D[2024-08-21|21:22:50.452] purged hostnames                             module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash19jqc8tsdtzvm2zd4mcg0vx9fll4feegfduvpp8/17676873/1/1/akash19ah5c95kq4kz2g6q5rdkdgt80kc3xycsd8plq8 manifest-group=dcloud
D[2024-08-21|21:22:50.453] teardown complete                            module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash19jqc8tsdtzvm2zd4mcg0vx9fll4feegfduvpp8/17676873/1/1/akash19ah5c95kq4kz2g6q5rdkdgt80kc3xycsd8plq8 manifest-group=dcloud
D[2024-08-21|21:22:50.453] shutting down                                module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash19jqc8tsdtzvm2zd4mcg0vx9fll4feegfduvpp8/17676873/1/1/akash19ah5c95kq4kz2g6q5rdkdgt80kc3xycsd8plq8 manifest-group=dcloud
D[2024-08-21|21:22:50.453] waiting on dm.wg                             module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash19jqc8tsdtzvm2zd4mcg0vx9fll4feegfduvpp8/17676873/1/1/akash19ah5c95kq4kz2g6q5rdkdgt80kc3xycsd8plq8 manifest-group=dcloud
I[2024-08-21|21:22:50.453] shutdown complete                            module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash19jqc8tsdtzvm2zd4mcg0vx9fll4feegfduvpp8/17676873/1/1/akash19ah5c95kq4kz2g6q5rdkdgt80kc3xycsd8plq8 manifest-group=dcloud
D[2024-08-21|21:22:50.453] hostnames released                           module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash19jqc8tsdtzvm2zd4mcg0vx9fll4feegfduvpp8/17676873/1/1/akash19ah5c95kq4kz2g6q5rdkdgt80kc3xycsd8plq8 manifest-group=dcloud
D[2024-08-21|21:22:50.453] sending manager into channel                 module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash19jqc8tsdtzvm2zd4mcg0vx9fll4feegfduvpp8/17676873/1/1/akash19ah5c95kq4kz2g6q5rdkdgt80kc3xycsd8plq8 manifest-group=dcloud
I[2024-08-21|21:22:50.453] manager done                                 module=provider-cluster cmp=provider cmp=service lease=akash19jqc8tsdtzvm2zd4mcg0vx9fll4feegfduvpp8/17676873/1/1/akash19ah5c95kq4kz2g6q5rdkdgt80kc3xycsd8plq8
D[2024-08-21|21:22:50.453] unreserving capacity                         module=provider-cluster cmp=provider cmp=service cmp=inventory-service order=akash19jqc8tsdtzvm2zd4mcg0vx9fll4feegfduvpp8/17676873/1/1
I[2024-08-21|21:22:50.453] attempting to removing reservation           module=provider-cluster cmp=provider cmp=service cmp=inventory-service order=akash19jqc8tsdtzvm2zd4mcg0vx9fll4feegfduvpp8/17676873/1/1
I[2024-08-21|21:22:50.453] removing reservation                         module=provider-cluster cmp=provider cmp=service cmp=inventory-service order=akash19jqc8tsdtzvm2zd4mcg0vx9fll4feegfduvpp8/17676873/1/1
I[2024-08-21|21:22:50.453] unreserve capacity complete                  module=provider-cluster cmp=provider cmp=service cmp=inventory-service order=akash19jqc8tsdtzvm2zd4mcg0vx9fll4feegfduvpp8/17676873/1/1
E[2024-08-21|21:17:53.841] execution error                              module=provider-cluster cmp=provider cmp=service cmp=deployment-manager lease=akash19jqc8tsdtzvm2zd4mcg0vx9fll4feegfduvpp8/17687779/1/1/akash19ah5c95kq4kz2g6q5rdkdgt80kc3xycsd8plq8 manifest-group=dcloud state=deploy-active err="kube-builder: ClusterParams() returned result of unexpected type (%!s(<nil>))"

andy108369 commented 1 month ago

Todo: Test provider v0.6.5-rc6

Provider v0.6.5-rc6 includes some patches that attempt to fix this issue.

andy108369 commented 1 month ago

The first week has been pretty smooth with 0.6.5-rc6 on Hurricane provider! :rocket:

andy108369 commented 2 weeks ago

@troian shall we release v0.6.5-rc6? It has been running well for the past three weeks on the Hurricane provider.

$ kubectl -n akash-services get pods -o custom-columns='NAME:.metadata.name,IMAGE:.spec.containers[*].image'
NAME                                                          IMAGE
akash-node-1-0                                                ghcr.io/akash-network/node:0.36.0
akash-provider-0                                              ghcr.io/akash-network/provider:0.6.5-rc6
operator-hostname-79fc5855bb-hk9bc                            ghcr.io/akash-network/provider:0.6.5-rc6
operator-inventory-7cdfdb65d7-msl6c                           ghcr.io/akash-network/provider:0.6.5-rc6
operator-inventory-hardware-discovery-control-01.hurricane2   ghcr.io/akash-network/provider:0.6.5-rc6
operator-inventory-hardware-discovery-worker-01.hurricane2    ghcr.io/akash-network/provider:0.6.5-rc6
operator-ip-796b49c77-k4xgh                                   ghcr.io/akash-network/provider:0.6.5-rc6

$ kubectl -n akash-services get pods -o wide
NAME                                                          READY   STATUS    RESTARTS      AGE   IP               NODE                    NOMINATED NODE   READINESS GATES
akash-node-1-0                                                1/1     Running   1 (44d ago)   44d   10.233.73.131    worker-01.hurricane2    <none>           <none>
akash-provider-0                                              1/1     Running   2 (9d ago)    24d   10.233.73.155    worker-01.hurricane2    <none>           <none>
operator-hostname-79fc5855bb-hk9bc                            1/1     Running   0             24d   10.233.73.161    worker-01.hurricane2    <none>           <none>
operator-inventory-7cdfdb65d7-msl6c                           1/1     Running   0             24d   10.233.73.144    worker-01.hurricane2    <none>           <none>
operator-inventory-hardware-discovery-control-01.hurricane2   1/1     Running   0             24d   10.233.117.178   control-01.hurricane2   <none>           <none>
operator-inventory-hardware-discovery-worker-01.hurricane2    1/1     Running   0             24d   10.233.73.179    worker-01.hurricane2    <none>           <none>
operator-ip-796b49c77-k4xgh                                   1/1     Running   0             24d   10.233.73.181    worker-01.hurricane2    <none>           <none>

$ kubectl -n akash-services logs akash-provider-0 |grep ClusterParams
Defaulted container "provider" out of: provider, init (init)
$ kubectl -n akash-services logs akash-provider-0 --previous |grep ClusterParams
Defaulted container "provider" out of: provider, init (init)