vmware-archive / kops

Kubernetes Operations (kops) - Production Grade K8s Installation, Upgrades, and Management
Apache License 2.0

vCenter running out of HTTP sessions, cluster deployment failing. #18

Open abrarshivani opened 7 years ago

abrarshivani commented 7 years ago

While launching a cluster, I am getting the following error:

I0329 00:52:07.789484   93214 vsphere_cloud.go:71] Creating vSphere Cloud with server(10.192.213.15), datacenter(VSAN-DC), cluster(VSAN-Cluster)
I0329 00:52:07.789495   93214 vsphere_cloud.go:85] Creating vSphere Cloud URL is https://10.192.213.15/sdk

error populating configuration: 503 Service Unavailable

prashima commented 7 years ago

@abrarshivani can you please assign an appropriate priority based on the frequency of this error?

prashima commented 7 years ago

I haven't been able to reproduce this with the latest code. Please try to reproduce it and post updated logs.

abrarshivani commented 7 years ago

VC runs out of sessions.

This is from VC logs:

2017-04-04T23:44:57.843Z info vpxd[7F5B126CD700] [Originator@6876 sub=vpxLro opID=71a6d224] [VpxLRO] -- FINISH lro-678806
2017-04-04T23:44:57.864Z error vpxd[7F5B381E6700] [Originator@6876 sub=HTTP session map] Out of HTTP sessions: Limited to 2000
2017-04-04T23:44:57.932Z error vpxd[7F5B384EC700] [Originator@6876 sub=HTTP session map] Out of HTTP sessions: Limited to 2000
2017-04-04T23:44:57.938Z info vpxd[7F5B388F4700] [Originator@6876 sub=vpxLro opID=655e0792] [VpxLRO] -- BEGIN lro-678810 -- SearchIndex -- vim.SearchIndex.findByUuid -- 5271786a-5ff5-b6ba-0cd7-3c43f98c2e55(5287e02c-7cdf-c5a5-96d8-56501a9a0492)
2017-04-04T23:44:57.939Z info vpxd[7F5B388F4700] [Originator@6876 sub=vpxLro opID=655e0792] [VpxLRO] -- FINISH lro-678810
2017-04-04T23:45:00.081Z info vpxd[7F5B127CF700] [Originator@6876 sub=vpxLro opID=7f863d62] [VpxLRO] -- BEGIN lro-678811 -- ServiceInstance -- vim.ServiceInstance.retrieveContent -- 52f3a2e8-61ac-e3b1-c501-ef6b2c201a9a
2017-04-04T23:45:00.081Z info vpxd[7F5B127CF700] [Originator@6876 sub=vpxLro opID=7f863d62] [VpxLRO] -- FINISH lro-678811
2017-04-04T23:45:00.094Z info vpxd[7F5B122C5700] [Originator@6876 sub=vpxLro opID=46282e56] [VpxLRO] -- BEGIN lro-678812 -- SessionManager -- vim.SessionManager.login -- 528c2de1-a332-d171-9252-1e18943e24fd
2017-04-04T23:45:00.142Z info vpxd[7F5B122C5700] [Originator@6876 sub=vpxLro opID=46282e56] [VpxLRO] -- FINISH lro-678812
2017-04-04T23:45:00.167Z info vpxd[7F5B12448700] [Originator@6876 sub=vpxLro opID=355d9a40] [VpxLRO] -- BEGIN lro-678815 -- SearchIndex -- vim.SearchIndex.findByUuid -- 528c2de1-a332-d171-9252-1e18943e24fd(52aa0800-4d44-25f4-a201-a48c8702358d)
2017-04-04T23:45:00.168Z info vpxd[7F5B12448700] [Originator@6876 sub=vpxLro opID=355d9a40] [VpxLRO] -- FINISH lro-678815
2017-04-04T23:45:00.184Z error vpxd[7F5B135EB700] [Originator@6876 sub=HTTP session map] Out of HTTP sessions: Limited to 2000
2017-04-04T23:45:00.218Z error vpxd[7F5B11FBF700] [Originator@6876 sub=HTTP session map] Out of HTTP sessions: Limited to 2000
2017-04-04T23:45:00.264Z error vpxd[7F5B13162700] [Originator@6876 sub=HTTP session map] Out of HTTP sessions: Limited to 2000
2017-04-04T23:45:00.268Z error vpxd[7F5B38267700] [Originator@6876 sub=HTTP session map] Out of HTTP sessions: Limited to 2000
2017-04-04T23:45:00.300Z error vpxd[7F5B12F5E700] [Originator@6876 sub=HTTP session map] Out of HTTP sessions: Limited to 2000

prashima commented 7 years ago

As discussed offline, we need to add session management to the kops code so that the vSphere cloud doesn't end up creating a lot of connections. We should also keep investigating whether the (govmomi-based) vSphere cloud is the real culprit here, because the number of clusters created does not really match the 2000-connection limit.

abrarshivani commented 7 years ago

@prashima I think it's the kops code, not the vSphere CloudProvider code, that creates lots of connections. Logs:

I0405 20:39:19.295304   96750 create_cluster.go:295] networking mode=flannel => {"flannel":{}}
I0405 20:39:19.296648   96750 create_cluster.go:677] Using SSH public key: /Users/shivania/.ssh/id_rsa.pub
I0405 20:39:19.296683   96750 vsphere_cloud.go:71] Creating vSphere Cloud with server(10.160.97.44), datacenter(VSAN-DC), cluster(VSAN-Cluster)
I0405 20:39:19.296701   96750 vsphere_cloud.go:85] Creating vSphere Cloud URL is https://10.160.97.44/sdk
I0405 20:39:19.759995   96750 vsphere_cloud.go:99] Created vSphere Cloud successfully: &{Server:10.160.97.44 Datacenter:VSAN-DC Cluster:VSAN-Cluster Username:administrator@vsphere.local Password:Admin!23 Client:0xc4203129e0 CoreDNSServer:http://10.192.217.24:2379 DNSZone:skydns.local}
I0405 20:39:19.760109   96750 subnets.go:183] Assigned CIDR 172.20.32.0/19 to subnet dummy2a
I0405 20:39:19.761423   96750 populate_cluster_spec.go:343] Defaulted KubeControllerManager.ClusterCIDR to 100.96.0.0/11
I0405 20:39:19.761445   96750 populate_cluster_spec.go:350] Defaulted ServiceClusterIPRange to 100.64.0.0/13
I0405 20:39:19.761453   96750 vsphere_cloud.go:71] Creating vSphere Cloud with server(10.160.97.44), datacenter(VSAN-DC), cluster(VSAN-Cluster)
I0405 20:39:19.761468   96750 vsphere_cloud.go:85] Creating vSphere Cloud URL is https://10.160.97.44/sdk
I0405 20:39:20.130442   96750 vsphere_cloud.go:99] Created vSphere Cloud successfully: &{Server:10.160.97.44 Datacenter:VSAN-DC Cluster:VSAN-Cluster Username:administrator@vsphere.local Password:Admin!23 Client:0xc4202b41b0 CoreDNSServer:http://10.192.217.24:2379 DNSZone:skydns.local}
I0405 20:39:20.130500   96750 subnets.go:48] All subnets have CIDRs; skipping asssignment logic
I0405 20:39:20.130564   96750 populate_cluster_spec.go:218] Normalizing kubernetes version: "v1.5.3" -> "1.5.3"
I0405 20:39:20.130575   96750 vsphere_cloud.go:71] Creating vSphere Cloud with server(10.160.97.44), datacenter(VSAN-DC), cluster(VSAN-Cluster)
I0405 20:39:20.130589   96750 vsphere_cloud.go:85] Creating vSphere Cloud URL is https://10.160.97.44/sdk
prashima commented 7 years ago

Agreed, let's take care of this for M1 by introducing the discussed session management. For now, we can focus on M0 issues.
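
For illustration only, here is a minimal sketch of what such session management could look like. It is not the kops or govmomi implementation. The idea is to log in once per vCenter server and reuse that session, instead of logging in every time a vSphere Cloud object is created (the kops log above shows three such creations within about a second). All names here (`Session`, `LoginFunc`, `SessionCache`) are hypothetical placeholders; a real version would wrap whatever client the vSphere cloud code currently creates.

```go
package main

import "sync"

// Session stands in for an authenticated vCenter client.
// Hypothetical placeholder type for this sketch.
type Session struct {
	Server string
}

// LoginFunc opens a new session against a server. Hypothetical: in practice
// it would wrap the existing login call made against https://<server>/sdk.
type LoginFunc func(server string) (*Session, error)

// SessionCache hands out one shared session per server.
type SessionCache struct {
	mu       sync.Mutex
	login    LoginFunc
	sessions map[string]*Session
}

// NewSessionCache builds an empty cache that uses login to open sessions.
func NewSessionCache(login LoginFunc) *SessionCache {
	return &SessionCache{login: login, sessions: make(map[string]*Session)}
}

// Get returns the cached session for server, logging in only on first use.
func (c *SessionCache) Get(server string) (*Session, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if s, ok := c.sessions[server]; ok {
		return s, nil
	}
	s, err := c.login(server)
	if err != nil {
		// Failed logins are not cached, so a later call retries.
		return nil, err
	}
	c.sessions[server] = s
	return s, nil
}

// CloseAll calls logout on every cached session and empties the cache, so
// sessions are released on the server instead of being left to expire.
// It returns the first logout error encountered, if any.
func (c *SessionCache) CloseAll(logout func(*Session) error) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	var firstErr error
	for server, s := range c.sessions {
		if err := logout(s); err != nil && firstErr == nil {
			firstErr = err
		}
		delete(c.sessions, server)
	}
	return firstErr
}
```

With a cache like this, repeated calls to `Get` for the same server during one `create cluster` run would share a single login, and `CloseAll` at exit would release it explicitly.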

fabulous-gopher commented 7 years ago

This issue was moved to kubernetes/kops#2747