@s0komma what's the name of the VM in `openstack server list`? Use `--hostname-override` in kubelet and set that to the same name you see in the server list command. Also check your metadata service or config drive to make sure you are seeing the same host name there too. Hopefully you don't have any special characters in the name. Another thing to do is bump up the verbosity of the logging (`--v=10`) on kubelet and kube-apiserver to see more messages about any mismatches.
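For example, the relevant kubelet flags would look something like this (a sketch only; how the flags are wired in depends on your systemd unit or startup script):

```sh
# hypothetical excerpt of the kubelet invocation
/usr/bin/kubelet \
  --hostname-override=<name-as-shown-in-openstack-server-list> \
  --v=10 \
  ...   # your existing flags
```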
@dims -- Thanks for responding. My hostname is different from what I see in `openstack server list`:
```
root@control-plane-256024810-1-359235278:~/skomma# hostname -f
control-plane-256024810-1-359235278.lab.skomma.kubernetes
root@control-plane-256024810-1-359235278:~/skomma# openstack server list | grep skomma
| a586801d-237c-44e6-a883-9194bcac052b | control-plane-lab-skomma-kubernetes-359235272 | ACTIVE | Primary5_External_Net=10.1.1.1 |
```
I added the following to kubelet:

`--hostname-override=control-plane-lab-skomma-kubernetes-359235272 \`

Below are the kubelet logs after bumping up the verbosity of the logging to `--v=10`:
```
May 11 02:09:26 control-plane-256024810-1-359235278 kubelet[30009]: I0511 02:09:26.584914 30009 flags.go:52] FLAG: --v="10"
May 11 02:09:26 control-plane-256024810-1-359235278 kubelet[30009]: I0511 02:09:26.584918 30009 flags.go:52] FLAG: --version="false"
May 11 02:09:26 control-plane-256024810-1-359235278 kubelet[30009]: I0511 02:09:26.584926 30009 flags.go:52] FLAG: --vmodule=""
May 11 02:09:26 control-plane-256024810-1-359235278 kubelet[30009]: I0511 02:09:26.584930 30009 flags.go:52] FLAG: --volume-plugin-dir="/usr/libexec/kubernetes/kubelet-plugins/volume/exec/"
May 11 02:09:26 control-plane-256024810-1-359235278 kubelet[30009]: I0511 02:09:26.584936 30009 flags.go:52] FLAG: --volume-stats-agg-period="1m0s"
May 11 02:09:26 control-plane-256024810-1-359235278 kubelet[30009]: I0511 02:09:26.584969 30009 feature_gate.go:226] feature gates: &{{} map[]}
May 11 02:09:26 control-plane-256024810-1-359235278 kubelet[30009]: I0511 02:09:26.584999 30009 controller.go:114] kubelet config controller: starting controller
May 11 02:09:26 control-plane-256024810-1-359235278 kubelet[30009]: I0511 02:09:26.585005 30009 controller.go:118] kubelet config controller: validating combination of defaults and flags
May 11 02:09:26 control-plane-256024810-1-359235278 kubelet[30009]: I0511 02:09:26.607136 30009 mount_linux.go:208] Detected OS with systemd
May 11 02:09:26 control-plane-256024810-1-359235278 kubelet[30009]: I0511 02:09:26.612602 30009 iptables.go:589] couldn't get iptables-restore version; assuming it doesn't support --wait
May 11 02:09:26 control-plane-256024810-1-359235278 kubelet[30009]: I0511 02:09:26.614049 30009 server.go:182] Version: v1.9.4
May 11 02:09:26 control-plane-256024810-1-359235278 kubelet[30009]: I0511 02:09:26.614101 30009 feature_gate.go:226] feature gates: &{{} map[]}
May 11 04:50:22 control-plane-256024810-1-359235278 audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=kubelet comm="systemd" exe="/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
May 11 04:50:22 control-plane-256024810-1-359235278 kubelet[24510]: I0511 04:50:22.036864 24510 server.go:305] Successfully initialized cloud provider: "openstack" from the config file: "/etc/kubernetes/cloud_config"
May 11 04:50:22 control-plane-256024810-1-359235278 kubelet[24510]: I0511 04:50:22.036915 24510 openstack_instances.go:39] openstack.Instances() called
May 11 04:50:22 control-plane-256024810-1-359235278 kubelet[24510]: error: failed to run Kubelet: failed to get instances from cloud provider
May 11 04:50:22 control-plane-256024810-1-359235278 systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
May 11 04:50:22 control-plane-256024810-1-359235278 systemd[1]: kubelet.service: Unit entered failed state.
May 11 04:50:22 control-plane-256024810-1-359235278 systemd[1]: kubelet.service: Failed with result 'exit-code'.
```
```
root@control-plane-256024810-1-359235278:~/skomma# openstack server show a586801d-237c-44e6-a883-9194bcac052b
+--------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Field | Value |
+--------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | az2 |
| OS-EXT-SRV-ATTR:host | xxxxx |
| OS-EXT-SRV-ATTR:hypervisor_hostname | xxxxxxcom |
| OS-EXT-SRV-ATTR:instance_name | instance-0002c8c5 |
| OS-EXT-STS:power_state | 1 |
| OS-EXT-STS:task_state | None |
| OS-EXT-STS:vm_state | active |
| OS-SRV-USG:launched_at | 2018-05-08T18:01:43.000000 |
| OS-SRV-USG:terminated_at | None |
| accessIPv4 | |
| accessIPv6 | |
| addresses | Primary5_External_Net=10.1q.q.q |
| config_drive | True |
| created | 2018-05-08T18:01:32Z |
| flavor | m1.medium (3) |
| hostId | 9e7065dcf87822477a26518116ff2f5f5d92b6e5c4536dff8e26bea1 |
| id | a586801d-237c-44e6-a883-9194bcac052b |
| image | Ubuntu-16.04-minimal-cloud-init RC10 (4fbebc68-9853-4c0a-9a41-e2aa1afaf7d4) |
| key_name | oneops_key-359232257-lab-359232294 |
| name | control-plane-lab-skomma-kubernetes-359235272 |
| os-extended-volumes:volumes_attached | [] |
| progress | 0 |
| project_id | b938d5464ece4d02876f9ec8db110bf7 |
| properties | assembly='skomma', component='359232309', environment='lab', instance='359235272', mgmt_url='https://web..com', organization='kubernetes', owner='', platform='control-plane' |
| security_groups | [{u'name': u'default'}, {u'name': u'control-plane-lab-skomma-kubernetes-359235182'}] |
| status | ACTIVE |
| updated | 2018-05-08T18:01:43Z |
| user_id | xxxxxxxxx |
+--------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
```
This is the error I see at the end:

```
May 11 05:00:04 control-plane-256024810-1-359235278 audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=kubelet comm="systemd" exe="/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
```
@dims -- Is the keystone URL accessible from this VM? -- Yes, I'm running `nova list` and other commands from the VM itself.
If you print the keystone catalog, you will see the URL for Nova; is the hostname used in the URL resolvable? -- I know for sure that the name in OpenStack is different from the VM's hostname, and the OpenStack name is not DNS-resolvable. Should the name in `nova list` be the same as the VM hostname, or is a DNS entry enough?
If you are running the OpenStack CLI and Nova CLI from the VM, then you should be OK in terms of the VM being able to reach the Nova and Keystone API endpoints. The name of the VM in `nova list` should be used as the `--hostname-override` when you start kubelet. Have you done that already?
Thank you @dims

> the name of the vm in the nova list should be used as the --hostname-override when you start kubelet. Have you done that already?

Yes, that is what I showed above.
So let's look at the code now. We can see the `openstack.Instances() called` message printed at the following line:

https://github.com/kubernetes/kubernetes/blob/release-1.9/pkg/cloudprovider/providers/openstack/openstack_instances.go#L39

but not the `Claiming to support Instances` message from line 46:

https://github.com/kubernetes/kubernetes/blob/release-1.9/pkg/cloudprovider/providers/openstack/openstack_instances.go#L46

Unfortunately we are missing printing the actual `err` we get from `os.NewComputeV2()`. So basically this points to something missing or problematic in your cloud config file. Are you able to make a local change to the code and try digging in?
This is what I see in the logs.

> Are you able to make a local change to the code and try digging in?

Yes, it would be of great help if you guide me through this.

```
May 11 16:41:37 control-plane-256024810-1-359235278 kubelet[32257]: I0511 16:41:37.739591 32257 openstack_instances.go:39] openstack.Instances() called
May 11 16:41:37 control-plane-256024810-1-359235278 kubelet[32257]: error: failed to run Kubelet: failed to get instances from cloud provider
May 11 16:41:37 control-plane-256024810-1-359235278 audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=kubelet comm="systemd" exe="/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
May 11 16:41:37 control-plane-256024810-1-359235278 audispd[585]: node=control-plane-256024810-1-359235278 type=SERVICE_STOP msg=audit(1526056897.740:421506): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=kubelet comm="systemd" exe="/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
```
Below is what I see under /etc/cloud/:

```
root@control-plane-256024810-1-359235278:~# cat /etc/cloud/cloud.cfg
preserve_hostname: true
root@control-plane-256024810-1-359235278:~# cat /etc/cloud/cloud.cfg.d/
05_logging.cfg 90_dpkg.cfg 99_hostname.cfg README
root@control-plane-256024810-1-359235278:~# cat /etc/cloud/cloud.cfg.d/99_hostname.cfg
hostname: control-plane-256024810-1-359235278
fqdn: control-plane-256024810-1-359235278.lab.skomma.kubernetes.com
```
So the cloud config file is the file that you pass to the kubelet or apiserver using the `--cloud-config` parameter. Can you please cross-check what's in there? On the code change, just add a new `glog.Infof()` statement and print the `err` message.
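A minimal sketch of that change in `pkg/cloudprovider/providers/openstack/openstack_instances.go` (an illustrative fragment only; the exact surrounding code and struct fields may differ slightly in your 1.9 tree):

```go
// Instances returns an implementation of cloudprovider.Instances for OpenStack.
func (os *OpenStack) Instances() (cloudprovider.Instances, bool) {
	glog.V(4).Info("openstack.Instances() called")

	compute, err := os.NewComputeV2()
	if err != nil {
		// added for debugging: surface the error that was previously swallowed
		glog.Errorf("unable to access compute v2 API : %v", err)
		return nil, false
	}

	glog.V(4).Info("Claiming to support Instances")
	return &Instances{compute: compute}, true
}
```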
Below is the cloud_config file passed to kubelet:

```
root@control-plane-256024810-1-359235278:/home/app# cat /etc/kubernetes/cloud_config
[Global]
region=region
username=kubernetes
password=******
auth-url=https://api-endpoint.com:5000/v3
tenant-id=********
domain-name=Default
```
@dims -- Since I'm using the in-tree version as of now, do you suggest trying the same test with CSI, so that there is more flexibility for tweaking things?
@s0komma you are using 1.9.x, so it's better to stick to that version. The CSI stuff is very much in progress and may not work with older releases.
@dims -- Thank you very much, I have updated the kubelet with your suggested change. This is what I see now:
```
May 11 18:15:18 control-plane-256024810-1-359235278 kubelet[1789]: I0511 18:15:18.229454 1789 server.go:305] Successfully initialized cloud provider: "openstack" from the config file: "/etc/kubernetes/cloud_config"
May 11 18:15:18 control-plane-256024810-1-359235278 kubelet[1789]: I0511 18:15:18.229496 1789 openstack_instances.go:39] openstack.Instances() called
May 11 18:15:18 control-plane-256024810-1-359235278 kubelet[1789]: E0511 18:15:18.229512 1789 openstack_instances.go:43] unable to access compute v2 API : failed to find compute v2 endpoint for region ndc5: No suitable endpoint could be found in the service catalog.
May 11 18:15:18 control-plane-256024810-1-359235278 kubelet[1789]: error: failed to run Kubelet: failed to get instances from cloud provider
May 11 18:15:18 control-plane-256024810-1-359235278 systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
May 11 18:15:18 control-plane-256024810-1-359235278 systemd[1]: kubelet.service: Unit entered failed state.
```
@s0komma - So I have no idea how your OpenStack environment is set up. Please review the code here, I think this is where your error is coming from. Cross-check your entries from `openstack service list` for the specific region, and look at the code below.
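For reference, the endpoint lookup in the provider is roughly the following (a paraphrased sketch of `NewComputeV2()` in `openstack.go`, not the exact source), which is why a `region` value that does not appear in your catalog fails with "No suitable endpoint could be found":

```go
// rough sketch: how the provider picks the Nova endpoint out of the Keystone catalog
func (os *OpenStack) NewComputeV2() (*gophercloud.ServiceClient, error) {
	compute, err := openstack.NewComputeV2(os.provider, gophercloud.EndpointOpts{
		Region: os.region, // taken from "region=" in the [Global] section of --cloud-config
	})
	if err != nil {
		return nil, fmt.Errorf("failed to find compute v2 endpoint for region %s: %v", os.region, err)
	}
	return compute, nil
}
```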
@dims -- A few updates: I know that the user account I have doesn't have access to run `openstack endpoint list` and `openstack service list`, so I'm trying to get that access added to the user as well.
@dims -- I'm now able to run the commands below as the `kubernetes` user mentioned in my cloud-config file:
```
root@control-plane-256024810-1-359235278:~/# openstack catalog list
+-----------+-----------+------------------------------------------------------------------------------------------------------+
| Name | Type | Endpoints |
+-----------+-----------+------------------------------------------------------------------------------------------------------+
| cinder | volume | RegionOne |
| | | public: https://api-endpoint.prod.walmart.com:8776/v1/xxxxxx |
| | | RegionOne |
| | | admin: https://api-endpoint.prod.walmart.com:8776/v1/xxxxxx |
| | | RegionOne |
| | | internal: https://api-endpoint.prod.walmart.com:8776/v1/xxxxx |
| | | |
| placement | placement | RegionOne |
| | | public: https://api-endpoint.prod.walmart.com:8780/placement |
| | | RegionOne |
| | | internal: https://api-endpoint.prod.walmart.com:8780/placement |
| | | RegionOne |
| | | admin: https://api-endpoint.prod.walmart.com:8780/placement |
| | | |
| nova | compute | RegionOne |
| | | admin: https://api-endpoint.prod.walmart.com:8774/v2.1/xxxxx |
| | | RegionOne |
| | | public: https://api-endpoint.prod.walmart.com:8774/v2.1/xxxx |
| | | RegionOne |
| | | internal: https://api-endpoint.prod.walmart.com:8774/v2.1/xxxxx |
| | | |
| neutron | network | RegionOne |
| | | admin: https://api-endpoint.prod.walmart.com:9696 |
| | | RegionOne |
| | | public: https://api-endpoint.prod.walmart.com:9696 |
| | | RegionOne |
| | | internal: https://api-endpoint.prod.walmart.com:9696 |
| | | |
| cinderv2 | volumev2 | RegionOne |
| | | internal: https://api-endpoint.prod.walmart.com:8776/v2/xxxxx |
| | | RegionOne |
| | | admin: https://api-endpoint.prod.walmart.com:8776/v2/xxxx |
| | | RegionOne |
| | | public: https://api-endpoint.prod.walmart.com:8776/v2/xxxx |
| | | |
| keystone | identity | RegionOne |
| | | internal: https://api-endpoint.prod.walmart.com:5000/v3 |
| | | RegionOne |
| | | admin: https://api-endpoint.prod.walmart.com:35357/v3 |
| | | RegionOne |
| | | public: https://api-endpoint.prod.walmart.com:5000/v3 |
| | | |
| cinderv3 | volumev3 | RegionOne |
| | | admin: https://api-endpoint.prod.walmart.com:8776/v3/xxxx |
| | | RegionOne |
| | | internal: https://api-endpoint.prod.walmart.com:8776/v3/xxxx |
| | | RegionOne |
| | | public: https://api-endpoint.prod.walmart.com:8776/v3/xxxx |
| | | |
| glance | image | RegionOne |
| | | public: https://api-endpoint.prod.walmart.com:9292 |
| | | RegionOne |
| | | admin: https://api-endpoint.prod.walmart.com:9292 |
| | | RegionOne |
| | | internal: https://api-endpoint.prod.walmart.com:9292 |
| | | |
+-----------+-----------+------------------------------------------------------------------------------------------------------+
root@control-plane-256024810-1-359235278:~/# openstack service list
+----------------------------------+-----------+-----------+
| ID | Name | Type |
+----------------------------------+-----------+-----------+
| 1234 | cinder | volume |
| 1234 | placement | placement |
| 1234 | nova | compute |
| 1234 | neutron | network |
| 1234 | cinderv2 | volumev2 |
| 1234 | keystone | identity |
| 1234 | cinderv3 | volumev3 |
| 1234 | glance | image |
+----------------------------------+-----------+-----------+
```
I'm able to reach all of the above API URLs from the VM.
@s0komma - The region in the catalog seems to be `RegionOne`, but you have `region=region` in your `/etc/kubernetes/cloud_config` file.
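In other words, the `region` value in the cloud config has to match a region name from the catalog. A sketch of the corrected file, assuming `RegionOne` is the right region for your compute endpoint (keep your real credentials and auth-url):

```
[Global]
region=RegionOne
username=kubernetes
password=******
auth-url=https://api-endpoint.com:5000/v3
tenant-id=********
domain-name=Default
```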
@dims Thank you very much for pointing it out. It looks like we have two regions, `OS_REGION=region` and `NOVA_REGION_NAME=RegionOne`. After correcting the region name and using `--hostname-override`, kubelet is up and running now. Below are the logs for it:
```
May 12 02:08:03 control-plane-256024810-1-359235278 kubelet[10537]: I0512 02:08:03.349666 10537 kuberuntime_manager.go:571] computePodActions got {KillPod:false CreateSandbox:false SandboxID:aa1c4985578b8377bd731a06e2e72884673aef446a737e05b1f2fb0533ff860e Attempt:0 NextInitContainerToStart:nil ContainersToStart:[] ContainersToKill:map[]} for pod "kube-controller-manager-control-plane-lab-skomma-kubernetes-359235272_kube-system(9fbec4125504e548fe3cd42707da48c4)"
May 12 02:08:03 control-plane-256024810-1-359235278 kubelet[10537]: I0512 02:08:03.443587 10537 config.go:99] Looking for [api file], have seen map[file:{}]
May 12 02:08:03 control-plane-256024810-1-359235278 kubelet[10537]: I0512 02:08:03.463451 10537 openstack_instances.go:39] openstack.Instances() called
May 12 02:08:03 control-plane-256024810-1-359235278 kubelet[10537]: I0512 02:08:03.463500 10537 openstack_instances.go:47] Claiming to support Instances
May 12 02:08:03 control-plane-256024810-1-359235278 kubelet[10537]: I0512 02:08:03.493328 10537 openstack_instances.go:39] openstack.Instances() called
May 12 02:08:03 control-plane-256024810-1-359235278 kubelet[10537]: I0512 02:08:03.493356 10537 openstack_instances.go:47] Claiming to support Instances
May 12 02:08:03 control-plane-256024810-1-359235278 kubelet[10537]: I0512 02:08:03.518508 10537 openstack_instances.go:39] openstack.Instances() called
May 12 02:08:03 control-plane-256024810-1-359235278 kubelet[10537]: I0512 02:08:03.518886 10537 openstack_instances.go:47] Claiming to support Instances
May 12 02:08:03 control-plane-256024810-1-359235278 kubelet[10537]: I0512 02:08:03.543647 10537 config.go:99] Looking for [api file], have seen map[file:{}]
May 12 02:08:03 control-plane-256024810-1-359235278 kubelet[10537]: I0512 02:08:03.643617 10537 config.go:99] Looking for [api file], have seen map[file:{}]
May 12 02:08:03 control-plane-256024810-1-359235278 kubelet[10537]: I0512 02:08:03.743764 10537 config.go:99] Looking for [api file], have seen map[file:{}]
May 12 02:08:03 control-plane-256024810-1-359235278 kubelet[10537]: I0512 02:08:03.843550 10537 config.go:99] Looking for [api file], have seen map[file:{}]
May 12 02:08:03 control-plane-256024810-1-359235278 kubelet[10537]: I0512 02:08:03.923502 10537 generic.go:183] GenericPLEG: Relisting
May 12 02:08:03 control-plane-256024810-1-359235278 kubelet[10537]: I0512 02:08:03.928719 10537 kubelet_pods.go:1369] Generating status for "kube-controller-manager-control-plane-lab-skomma-kubernetes-359235272_kube-system(9fbec4125504e548fe3cd42707da48c4)"
May 12 02:08:03 control-plane-256024810-1-359235278 kubelet[10537]: I0512 02:08:03.928822 10537 kubelet_node_status.go:273] Setting node annotation to enable volume controller attach/detach
May 12 02:08:03 control-plane-256024810-1-359235278 kubelet[10537]: I0512 02:08:03.928833 10537 openstack_instances.go:39] openstack.Instances() called
May 12 02:08:03 control-plane-256024810-1-359235278 kubelet[10537]: I0512 02:08:03.928844 10537 openstack_instances.go:47] Claiming to support Instances
May 12 02:08:03 control-plane-256024810-1-359235278 kubelet[10537]: I0512 02:08:03.943623 10537 config.go:99] Looking for [api file], have seen map[file:{}]
May 12 02:08:04 control-plane-256024810-1-359235278 kubelet[10537]: I0512 02:08:04.043717 10537 config.go:99] Looking for [api file], have seen map[file:{}]
May 12 02:08:04 control-plane-256024810-1-359235278 kubelet[10537]: I0512 02:08:04.132019 10537 kubelet_node_status.go:329] Adding node label from cloud provider: beta.kubernetes.io/instance-type=3
May 12 02:08:04 control-plane-256024810-1-359235278 kubelet[10537]: I0512 02:08:04.132064 10537 openstack.go:573] Claiming to support Zones
May 12 02:08:04 control-plane-256024810-1-359235278 kubelet[10537]: I0512 02:08:04.132074 10537 openstack.go:587] Current zone is {az2 RegionOne}
May 12 02:08:04 control-plane-256024810-1-359235278 kubelet[10537]: I0512 02:08:04.132101 10537 kubelet_node_status.go:340] Adding node label from cloud provider: failure-domain.beta.kubernetes.io/zone=az2
May 12 02:08:04 control-plane-256024810-1-359235278 kubelet[10537]: I0512 02:08:04.132114 10537 kubelet_node_status.go:344] Adding node label from cloud provider: failure-domain.beta.kubernetes.io/region=RegionOne
May 12 02:08:04 control-plane-256024810-1-359235278 kubelet[10537]: I0512 02:08:04.132126 10537 openstack_instances.go:39] openstack.Instances() called
May 12 02:08:04 control-plane-256024810-1-359235278 kubelet[10537]: I0512 02:08:04.132136 10537 openstack_instances.go:47] Claiming to support Instances
May 12 02:08:04 control-plane-256024810-1-359235278 kubelet[10537]: I0512 02:08:04.132140 10537 openstack_instances.go:70] NodeAddresses(control-plane-lab-skomma-kubernetes-359235272) called
```
@dims The kubelet is up, but then I noticed that the API server was not coming up, with the error below:

```
--external-hostname was not specified. Trying to get it from the cloud provider
```

So I added `--external-hostname=control-plane-lab-skomma-kubernetes-359235272`, as this is the name I see in `openstack server list`, and then the API server came up and the controller is also up and running. But the node status still remains `NotReady`. Looking into the logs I see the message below; any guidance regarding this would be great:
```
May 12 02:30:17 control-plane-256024810-1-359235278 kubelet[18650]: E0512 02:30:17.448773 18650 kubelet_node_status.go:106] Unable to register node "control-plane-lab-skomma-kubernetes-359235272" with API server: nodes "control-plane-lab-skomma-kubernetes-359235272" is forbidden: node "control-plane-256024810-1-359235278" cannot modify node "control-plane-lab-skomma-kubernetes-359235272"
```
If you already bootstrapped the node and it requested/created a client certificate for its original name, and you then changed the name of the node (via cloud provider changes), you should delete the now-mismatched `--kubeconfig` file containing the credentials for the original node name and let it re-bootstrap.
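A rough sketch of that procedure, assuming TLS bootstrapping is in use (the node name and kubeconfig path below are examples; use whatever your kubelet's `--kubeconfig` flag actually points at):

```sh
# on the node whose name changed
systemctl stop kubelet
rm /var/lib/kubelet/kubeconfig        # the mismatched credentials for the old node name (example path)

# from a machine with admin credentials, remove the old node object if it still exists
kubectl delete node control-plane-256024810-1-359235278

# start kubelet again; it re-bootstraps and requests a certificate for the new node name
systemctl start kubelet
```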
@liggitt -- Thanks for the details, let me re-bootstrap with the name found under the cloud provider and see what happens.
@liggitt & @dims -- I have a few specific questions on how this would really work. The OpenStack instance name is `hostname-1` and the actual host name is `host-1-21`; today in `kubectl get nodes` we actually see it as `host-1-21`. If we re-bootstrap the node and use the OpenStack instance name `hostname-1`, then `kubectl get nodes` would show the node as `hostname-1`. What's the best way to go about it? Any guidance for this would be great.
@dims it appears incompatible changes were made to the OpenStack cloud provider in 1.10 in https://github.com/kubernetes/kubernetes/pull/58502.
Previously, the instance name was used as the node name. That PR changed it to the hostname (and a later PR made additional changes in https://github.com/kubernetes/kubernetes/pull/61000). The intent of those PRs was to ensure the attribute used was always a valid node name, but an unintended side effect was that it broke setups that named their instances intentionally and were previously working.
It is fine to have an option to alter the attribute used, but changes should be backwards compatible and not break pre-1.10 deployments.
Options I could see:
@liggitt right. it's being tracked here - https://github.com/kubernetes/kubernetes/issues/62295
Also, @s0komma is on 1.9.x so he has not hit the problem in 1.10 yet :)
The version I'm currently using is 1.9.4.
@liggitt & @dims -- I wanted to update you guys on what I have done so far:

- As suggested, I tried to re-bootstrap a worker node: on the worker node I re-bootstrapped `kubelet`, and on the master I did `kubectl delete node host-1-21`.
- For the `--kubeconfig`: since we were initially using `system:node:host-1-21`, I updated my certs to use `system:node:hostname-1` (the OpenStack instance name).
- In `kubectl get nodes` we are now seeing the new worker node registered as `hostname-1` (the OpenStack instance name).
- Note: I did not have to use `--hostname-override=hostname-1` in kubelet, since I was registering the node with the OpenStack server name.

So now the question is: when we are building kubelet and registering the server to the control plane, we want it to be registered using the actual host name, so that when we do `kubectl get nodes` we actually see it as `host-1-21` (host name) instead of `hostname-1` (OpenStack instance name).
@liggitt & @dims -- any thoughts on the above point?
> when we are building kubelet and registering the server to the control plane we want it to be registered using the actual host name, so that when we do kubectl get nodes we actually see it as host-1-21 (host name) instead of hostname-1 (openstack instance name)

I believe the only way to do that with the OpenStack cloud provider in 1.9 is to make your OpenStack instance name identical to the OpenStack hostname.
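If renaming the instance on the OpenStack side is acceptable in your environment, the Nova name can be changed with the standard client (shown here with the example names from this thread; check first whether any other tooling depends on the old instance name):

```sh
# rename the instance so its Nova name matches the real hostname
openstack server set --name host-1-21 hostname-1
```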
@liggitt -- Thank you for the update. Is this something that's an option in a future version?
Issues go stale after 90d of inactivity.
Mark the issue as fresh with `/remove-lifecycle stale`.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with `/close`.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.

/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with `/remove-lifecycle rotten`.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with `/close`.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.

/lifecycle rotten
Rotten issues close after 30d of inactivity.
Reopen the issue with `/reopen`.
Mark the issue as fresh with `/remove-lifecycle rotten`.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.

/close
@fejta-bot: Closing this issue.
Is this a BUG REPORT or FEATURE REQUEST?:

What happened: Hi guys, I'm trying to set up k8s persistent storage using Cinder. My k8s version is 1.9.4. I followed all the steps: we added `--cloud-provider=openstack --cloud-config=/etc/kubernetes/cloud_config` on all the nodes of k8s, including master and worker; the flags were added to kubelet, controller and API server on all master nodes, and only to kubelet on worker nodes (edited). After doing so kubelet won't start. Kubelet logs are below:

```
May 10 20:10:00 control-plane-256024810-1-359235278 kubelet[12070]: I0510 20:10:00.036796 12070 server.go:305] Successfully initialized cloud provider: "openstack" from the config file: "/etc/kubernetes/cloud_config"
May 10 20:10:00 control-plane-256024810-1-359235278 kubelet[12070]: I0510 20:10:00.036907 12070 openstack_instances.go:39] openstack.Instances() called
May 10 20:10:00 control-plane-256024810-1-359235278 kubelet[12070]: error: failed to run Kubelet: failed to get instances from cloud provider
May 10 20:10:00 control-plane-256024810-1-359235278 systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
May 10 20:10:00 control-plane-256024810-1-359235278 systemd[1]: kubelet.service: Unit entered failed state.
```

I tried to see if `nova list` and `cinder list` are working from the VM with the same credentials, and they do.
What you expected to happen: Kubelet to come up.

How to reproduce it (as minimally and precisely as possible):
Anything else we need to know?:
Environment:
Kernel (e.g. `uname -a`): Linux control-plane-256024810-1-359235278 4.4.0-89-generic #112-Ubuntu SMP Mon Jul 31 19:38:41 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux