Closed liyang516 closed 1 year ago
The install failed, but you can run rke2 -v, so it's obviously installed. Where's the failure? I don't see even so much as an error message here.
Do you know what the problem is? @brandond
I'm not sure why iptables would segfault on your hardware; I suspect perhaps your processor model lacks something the binary expects. What is the output of cat /proc/cpuinfo?
I am using an arm64 virtual machine. Here is the CPU info:
# lscpu
Architecture: aarch64
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 1
NUMA node(s): 2
Model: 0
BogoMIPS: 200.00
NUMA node0 CPU(s): 0,1
NUMA node1 CPU(s): 2,3
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma dcpop asimddp asimdfhm
# cat /proc/cpuinfo
processor : 0
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma dcpop asimddp asimdfhm
CPU implementer : 0x48
CPU architecture: 8
CPU variant : 0x1
CPU part : 0xd01
CPU revision : 0
processor : 1
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma dcpop asimddp asimdfhm
CPU implementer : 0x48
CPU architecture: 8
CPU variant : 0x1
CPU part : 0xd01
CPU revision : 0
processor : 2
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma dcpop asimddp asimdfhm
CPU implementer : 0x48
CPU architecture: 8
CPU variant : 0x1
CPU part : 0xd01
CPU revision : 0
processor : 3
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma dcpop asimddp asimdfhm
CPU implementer : 0x48
CPU architecture: 8
CPU variant : 0x1
CPU part : 0xd01
CPU revision : 0
Which VM platform are you running it in? Can you provide steps to reproduce? This works for me on multiple physical arm64 platforms.
My VM runs on OpenStack. The physical compute node runs CentOS 7.8 on a HUAWEI Kunpeng 920 5220 CPU:
# Physical node info
# arch
aarch64
# cat /etc/redhat-release
CentOS Linux release 7.8.2003 (AltArch)
# lscpu
Architecture: aarch64
Byte Order: Little Endian
CPU(s): 64
On-line CPU(s) list: 0-63
Thread(s) per core: 1
Core(s) per socket: 32
Socket(s): 2
NUMA node(s): 2
Model: 0
CPU max MHz: 2600.0000
CPU min MHz: 200.0000
BogoMIPS: 200.00
L1d cache: 64K
L1i cache: 64K
L2 cache: 512K
L3 cache: 32768K
NUMA node0 CPU(s): 0-31
NUMA node1 CPU(s): 32-63
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma dcpop asimddp asimdfhm
# cat /proc/cpuinfo
processor : 0
BogoMIPS : 200.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma dcpop asimddp asimdfhm
CPU implementer : 0x48
CPU architecture: 8
CPU variant : 0x1
CPU part : 0xd01
CPU revision : 0
...
# dmidecode -t processor
# dmidecode 3.2
Getting SMBIOS data from sysfs.
SMBIOS 3.2.0 present.
Handle 0x001B, DMI type 4, 48 bytes
Processor Information
Socket Designation: CPU01
Type: Central Processor
Family: ARM
Manufacturer: HiSilicon
ID: 10 D0 1F 48 00 00 00 00
Signature: Implementor 0x48, Variant 0x1, Architecture 15, Part 0xd01, Revision 0
Version: HUAWEI Kunpeng 920 5220
Voltage: 0.9 V
External Clock: 100 MHz
Max Speed: 2600 MHz
Current Speed: 2600 MHz
Status: Populated, Enabled
Upgrade: Unknown
L1 Cache Handle: 0x0018
L2 Cache Handle: 0x0019
L3 Cache Handle: 0x001A
Serial Number: 6B73215401A03324
Asset Tag: To be filled by O.E.M.
Part Number: To be filled by O.E.M.
Core Count: 32
Core Enabled: 32
Thread Count: 32
Characteristics:
64-bit capable
Multi-Core
Execute Protection
Enhanced Virtualization
Power/Performance Control
OpenStack is the Train release. The nova libvirt-related configuration:
# cat /etc/nova/nova.conf
[libvirt]
connection_uri = qemu:///system
cpu_mode = host-passthrough
virt_type = kvm
The following is the XML of the virtual machine:
# virsh list
Id Name State
-----------------------------------
30 ubuntu-arm running
38 instance-00001ce6 running
43 instance-00001ce5 running
52 instance-00001d33 running
54 instance-00001d4a running
# virsh dumpxml 54
<domain type='kvm' id='54'>
<name>instance-00001d4a</name>
<uuid>5db3a62c-bd34-4dce-b642-290e3df6db1f</uuid>
<metadata>
<nova:instance xmlns:nova="http://openstack.org/xmlns/libvirt/nova/1.0">
<nova:package version="0.0.0-1.el7"/>
<nova:name>centos78v.arm.bjat.qianxin-inc.cn</nova:name>
<nova:creationTime>2023-09-07 06:38:29</nova:creationTime>
<nova:flavor name="kc1.large.2">
<nova:memory>8192</nova:memory>
<nova:disk>0</nova:disk>
<nova:swap>0</nova:swap>
<nova:ephemeral>0</nova:ephemeral>
<nova:vcpus>4</nova:vcpus>
</nova:flavor>
<nova:owner>
<nova:user uuid="be803e337dbb423097ab049b5af4df95">admin</nova:user>
<nova:project uuid="e93293733175465bbc00ccdf40a6f7b0">polaris-dev</nova:project>
</nova:owner>
</nova:instance>
</metadata>
<memory unit='KiB'>8388608</memory>
<currentMemory unit='KiB'>8388608</currentMemory>
<vcpu placement='static'>4</vcpu>
<cputune>
<shares>4096</shares>
<vcpupin vcpu='0' cpuset='30'/>
<vcpupin vcpu='1' cpuset='5'/>
<vcpupin vcpu='2' cpuset='41'/>
<vcpupin vcpu='3' cpuset='43'/>
<emulatorpin cpuset='5,30,41,43'/>
</cputune>
<numatune>
<memory mode='strict' nodeset='0-1'/>
<memnode cellid='0' mode='strict' nodeset='0'/>
<memnode cellid='1' mode='strict' nodeset='1'/>
</numatune>
<resource>
<partition>/machine</partition>
</resource>
<sysinfo type='smbios'>
<system>
<entry name='manufacturer'>RDO</entry>
<entry name='product'>OpenStack Compute</entry>
<entry name='version'>0.0.0-1.el7</entry>
<entry name='serial'>5db3a62c-bd34-4dce-b642-290e3df6db1f</entry>
<entry name='uuid'>5db3a62c-bd34-4dce-b642-290e3df6db1f</entry>
<entry name='family'>Virtual Machine</entry>
</system>
</sysinfo>
<os>
<type arch='aarch64' machine='virt-rhel7.6.0'>hvm</type>
<loader readonly='yes' type='pflash'>/usr/share/AAVMF/AAVMF_CODE.fd</loader>
<nvram>/var/lib/libvirt/qemu/nvram/instance-00001d4a_VARS.fd</nvram>
<boot dev='hd'/>
<smbios mode='sysinfo'/>
</os>
<features>
<acpi/>
<apic/>
<gic version='3'/>
</features>
<cpu mode='host-passthrough' check='none'>
<topology sockets='2' cores='2' threads='1'/>
<numa>
<cell id='0' cpus='0-1' memory='4194304' unit='KiB'/>
<cell id='1' cpus='2-3' memory='4194304' unit='KiB'/>
</numa>
</cpu>
<clock offset='utc'>
<timer name='pit' tickpolicy='delay'/>
<timer name='rtc' tickpolicy='catchup'/>
</clock>
<on_poweroff>destroy</on_poweroff>
<on_reboot>restart</on_reboot>
<on_crash>destroy</on_crash>
<devices>
<emulator>/usr/libexec/qemu-kvm</emulator>
<disk type='network' device='disk'>
<driver name='qemu' type='raw' cache='none' discard='unmap'/>
<auth username='cinder'>
<secret type='ceph' uuid='fa197221-4a80-4976-a7c1-156b5fb7076e'/>
</auth>
<source protocol='rbd' name='cinder.volumes.hdd/volume-0806b304-dd14-44fc-a333-fec13a2e0826'>
<host name='10.57.37.52' port='6789'/>
<host name='10.57.37.53' port='6789'/>
</source>
<target dev='sda' bus='scsi'/>
<iotune>
<total_bytes_sec>60000000</total_bytes_sec>
<total_iops_sec>500</total_iops_sec>
</iotune>
<serial>0806b304-dd14-44fc-a333-fec13a2e0826</serial>
<alias name='scsi0-0-0-0'/>
<address type='drive' controller='0' bus='0' target='0' unit='0'/>
</disk>
<controller type='scsi' index='0' model='virtio-scsi'>
<alias name='scsi0'/>
<address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
</controller>
<controller type='usb' index='0' model='qemu-xhci'>
<alias name='usb'/>
<address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
</controller>
<controller type='pci' index='0' model='pcie-root'>
<alias name='pcie.0'/>
</controller>
<controller type='pci' index='1' model='pcie-root-port'>
<model name='pcie-root-port'/>
<target chassis='1' port='0x8'/>
<alias name='pci.1'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0' multifunction='on'/>
</controller>
<controller type='pci' index='2' model='pcie-root-port'>
<model name='pcie-root-port'/>
<target chassis='2' port='0x9'/>
<alias name='pci.2'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
</controller>
<controller type='pci' index='3' model='pcie-root-port'>
<model name='pcie-root-port'/>
<target chassis='3' port='0xa'/>
<alias name='pci.3'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
</controller>
<controller type='pci' index='4' model='pcie-root-port'>
<model name='pcie-root-port'/>
<target chassis='4' port='0xb'/>
<alias name='pci.4'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x3'/>
</controller>
<controller type='pci' index='5' model='pcie-root-port'>
<model name='pcie-root-port'/>
<target chassis='5' port='0xc'/>
<alias name='pci.5'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x4'/>
</controller>
<controller type='pci' index='6' model='pcie-root-port'>
<model name='pcie-root-port'/>
<target chassis='6' port='0xd'/>
<alias name='pci.6'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x5'/>
</controller>
<interface type='ethernet'>
<mac address='fa:16:3c:24:e3:7b'/>
<target dev='tap047da446-bd'/>
<model type='virtio'/>
<alias name='net0'/>
<address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
</interface>
<serial type='pty'>
<source path='/dev/pts/6'/>
<log file='/var/lib/nova/instances/5db3a62c-bd34-4dce-b642-290e3df6db1f/console.log' append='off'/>
<target type='system-serial' port='0'>
<model name='pl011'/>
</target>
<alias name='serial0'/>
</serial>
<console type='pty' tty='/dev/pts/6'>
<source path='/dev/pts/6'/>
<log file='/var/lib/nova/instances/5db3a62c-bd34-4dce-b642-290e3df6db1f/console.log' append='off'/>
<target type='serial' port='0'/>
<alias name='serial0'/>
</console>
<input type='tablet' bus='usb'>
<alias name='input0'/>
<address type='usb' bus='0' port='1'/>
</input>
<input type='keyboard' bus='usb'>
<alias name='input1'/>
<address type='usb' bus='0' port='2'/>
</input>
<graphics type='vnc' port='5904' autoport='yes' listen='0.0.0.0'>
<listen type='address' address='0.0.0.0'/>
</graphics>
<video>
<model type='virtio' heads='1' primary='yes'/>
<alias name='video0'/>
<address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
</video>
<memballoon model='virtio'>
<stats period='10'/>
<alias name='balloon0'/>
<address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
</memballoon>
</devices>
<seclabel type='dynamic' model='dac' relabel='yes'>
<label>+0:+0</label>
<imagelabel>+0:+0</imagelabel>
</seclabel>
</domain>
I faced the same issue with version v1.28.1+rke2r1 on arm64, but it works fine on an x86 machine.
Which operating system is used?
The issue seems related to the iptables installed on the machine. Could you check if the iptables binary is built for arm64?
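One way to answer that without the file utility is to read the architecture straight from the ELF header. A minimal sketch, assuming a little-endian binary and an od that supports -A, -t, -j and -N; the function name elf_machine is made up for illustration. Note that a wrapper script (which /usr/sbin/iptables may be in some images) is reported as not-elf, so point it at the real binary.

```shell
#!/bin/sh
# Print the target architecture recorded in an ELF binary's header.
elf_machine() {
    path=$1
    # Bytes 0-3 must be the ELF magic: 0x7f 'E' 'L' 'F'.
    magic=$(od -An -tx1 -N4 "$path" | tr -d ' \n')
    if [ "$magic" != "7f454c46" ]; then
        echo "not-elf"
        return 1
    fi
    # e_machine is a 16-bit field at byte offset 18.
    # Assumption: the file is little-endian (true for aarch64 and x86_64 Linux).
    set -- $(od -An -tu1 -j18 -N2 "$path")
    code=$(( $1 + $2 * 256 ))
    case $code in
        183) echo "aarch64" ;;
        62)  echo "x86_64" ;;
        40)  echo "arm" ;;
        3)   echo "i386" ;;
        *)   echo "unknown($code)" ;;
    esac
}
```

For example, elf_machine /usr/sbin/xtables-nft-multi should print aarch64 for a binary built for arm64. A matching architecture does not rule out a segfault; it only rules out a wrong-platform binary.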
NAME="CentOS Linux"
VERSION="7 (AltArch)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (AltArch)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"
CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
The version of iptables: iptables v1.4.21
I also reproduced this issue without Kubernetes:
[root@my-vm ~]# docker run -it --entrypoint=bash rancher/hardened-kubernetes:v1.28.1-rke2r1-build20230825
c5cfe3404ad8:/ # iptables --version
/usr/sbin/iptables: line 65: awk: command not found
Segmentation fault
c5cfe3404ad8:/ # iptables --version
Segmentation fault
@rbrtbnfgl our hardened-kubernetes image actually includes iptables binaries from k3s-root: https://github.com/rancher/image-build-kubernetes/blob/master/Dockerfile#L58-L61
I suspect we need to bump this to v0.12.2 or newer for the 64k page size fix.
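To see whether the 64k page size fix is relevant on a given node, it may help to check the kernel page size first. A minimal sketch using getconf PAGESIZE; the helper names page_size_kib and uses_64k_pages are made up for illustration.

```shell
#!/bin/sh
# Report the kernel's memory page size in KiB (for example 4 or 64).
page_size_kib() {
    bytes=$(getconf PAGESIZE)
    echo $(( bytes / 1024 ))
}

# Succeed only when the running kernel uses 64 KiB pages.
uses_64k_pages() {
    [ "$(page_size_kib)" -eq 64 ]
}
```

Running page_size_kib on the affected CentOS 7 (AltArch) VM and on the Ubuntu 22.04 arm64 host where the install works would show whether the two differ. I would expect 64 on the former and 4 on the latter, but that is an assumption, not something measured in this thread.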
We closed this in k3s after community validation, and the fix is the same here, so I am going to close it out here with the same reasoning per https://github.com/k3s-io/k3s/issues/7335#issuecomment-1529916982. If this does not resolve the issue, please let me know and we can work towards a better fix and getting an environment where we can reproduce it. Thank you!
@rancher-max rke2 v1.28.2+rke2r1 still fails with the "iptables Segmentation fault" error. However, I verified the k3s-tools-arm tar package, and it works fine on CentOS 7. It seems that the ARCH parameter in line 57 of the Dockerfile should be changed to a variable.
It's a Docker ARG, which is a variable passed in to the Dockerfile at build time... that is how Docker args work. https://github.com/rancher/image-build-kubernetes/blob/master/Makefile#L36
Can you confirm which iptables binary specifically is segfaulting? I suspect there may be another binary embedded somewhere that is not usable on your platform.
@brandond Here is a detailed binary comparison
docker run -it -d -v /root/k3s-root-v0.12.1:/root/k3s-root-v0.12.1 -v /root/k3s-root-v0.12.2:/root/k3s-root-v0.12.2 -v /root/k3s-root-v0.13.0:/root/k3s-root-v0.13.0 --entrypoint=bash rancher/hardened-kubernetes:v1.28.2-rke2r1-build20230913
[root@test ~]# docker exec -it aff16dc54a22 /bin/bash
aff16dc54a22:~ # ls -al /usr/sbin/iptables
lrwxrwxrwx 1 root root 17 Oct 16 04:06 /usr/sbin/iptables -> xtables-nft-multi
aff16dc54a22:~ # md5sum /usr/sbin/xtables-nft-multi
ea4d47cd148cd0d0bab7586f28636cb4 /usr/sbin/xtables-nft-multi
aff16dc54a22:~ # md5sum /root/k3s-root-v0.12.1/bin/aux/xtables-nft-multi
ea4d47cd148cd0d0bab7586f28636cb4 /root/k3s-root-v0.12.1/bin/aux/xtables-nft-multi
aff16dc54a22:~ # md5sum /root/k3s-root-v0.12.2/bin/aux/xtables-nft-multi
fa36e7fb616aa85b7298481493e58a66 /root/k3s-root-v0.12.2/bin/aux/xtables-nft-multi
aff16dc54a22:~ # md5sum /root/k3s-root-v0.13.0/bin/aux/xtables-nft-multi
966a67cb630421221887c448256b57ad /root/k3s-root-v0.13.0/bin/aux/xtables-nft-multi
aff16dc54a22:~ # /usr/sbin/xtables-nft-multi iptables --version
Segmentation fault
aff16dc54a22:~ # /root/k3s-root-v0.12.1/bin/aux/xtables-nft-multi iptables --version
Segmentation fault
aff16dc54a22:~ # /root/k3s-root-v0.12.2/bin/aux/xtables-nft-multi iptables --version
bash: /root/k3s-root-v0.12.2/bin/aux/xtables-nft-multi: cannot execute binary file: Exec format error
aff16dc54a22:~ #
aff16dc54a22:~ # /root/k3s-root-v0.13.0/bin/aux/xtables-nft-multi iptables --version
iptables v1.8.8 (nf_tables)
It turns out that the xtables-nft-multi binary from k3s-root-arm v0.12.1 has the same md5sum as the one in the rke2 v1.28.2+rke2r1 image, and both segfault. The binary from k3s-root-arm v0.13.0 works well on my arm platform, so a working build does exist. It therefore looks like rke2 v1.28.2+rke2r1 ships a mismatched k3s-root-arm version. Can you check why rke2 v1.28.2+rke2r1 is not using k3s-root-arm v0.13.0, or give rke2 a matching version of k3s-root-arm?
Hmm. https://github.com/rancher/image-build-kubernetes/releases/tag/v1.28.2-rke2r1-build20230913 shows that it was built against https://github.com/rancher/image-build-kubernetes/commit/c29ac4f85b77e19ace2fb4771c3a091b3bb14afa which has the updated version... I'll have to see if that is perhaps also set elsewhere.
oh, derp - we also define it here... and this one takes precedence https://github.com/rancher/image-build-kubernetes/blob/master/Makefile#L18
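A version pinned in more than one place is easy to miss, since a value passed with --build-arg overrides the ARG default in the Dockerfile. A rough sketch of a helper that lists every assignment of a variable in a checkout. The function name find_definitions is made up for illustration, and K3S_ROOT_VERSION in the usage note is an assumed variable name; the thread only says the version is defined in both the Dockerfile and the Makefile.

```shell
#!/bin/sh
# List every file and line under DIR that assigns NAME.
# Matches Makefile-style (NAME ?= v, NAME := v, NAME = v)
# and Dockerfile-style (ARG NAME=v) assignments.
find_definitions() {
    name=$1
    dir=$2
    grep -rnE "(^|[[:space:]])${name}[[:space:]]*(\?=|:=|=)" "$dir"
}
```

Something like find_definitions K3S_ROOT_VERSION . run from the root of the image-build-kubernetes checkout would then show each place the version is set, so they can be bumped together.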
Will need to be tested once we have hardened-kubernetes images tagged for 1.28.3
Validated on master branch with RC v1.28.3-rc2+rke2r1
NAME="CentOS Linux"
VERSION="7 (AltArch)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (AltArch)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7:server"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"
CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
Config.yaml:
token: secret
write-kubeconfig-mode: "0644"
profile: "cis"
Cluster Configuration:
1 server
Testing Steps
Copy config.yaml
$ sudo mkdir -p /etc/rancher/rke2 && sudo cp config.yaml /etc/rancher/rke2
kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system cloud-controller-manager-ip-172-31-41-105.us-east-2.compute.internal 1/1 Running 0 14m
kube-system etcd-ip-172-31-41-105.us-east-2.compute.internal 1/1 Running 0 14m
kube-system helm-install-rke2-canal-66hpq 0/1 Completed 0 13m
kube-system helm-install-rke2-coredns-v7xlf 0/1 Completed 0 13m
kube-system helm-install-rke2-ingress-nginx-26k6g 0/1 Completed 0 13m
kube-system helm-install-rke2-metrics-server-n7vmb 0/1 Completed 0 13m
kube-system helm-install-rke2-snapshot-controller-crd-vxfmw 0/1 Completed 0 13m
kube-system helm-install-rke2-snapshot-controller-np944 0/1 Completed 1 13m
kube-system helm-install-rke2-snapshot-validation-webhook-vdjk5 0/1 Completed 0 13m
kube-system kube-apiserver-ip-172-31-41-105.us-east-2.compute.internal 1/1 Running 0 14m
kube-system kube-controller-manager-ip-172-31-41-105.us-east-2.compute.internal 1/1 Running 0 14m
kube-system kube-proxy-ip-172-31-41-105.us-east-2.compute.internal 1/1 Running 0 14m
kube-system kube-scheduler-ip-172-31-41-105.us-east-2.compute.internal 1/1 Running 0 14m
kube-system rke2-canal-nfnl4 2/2 Running 0 13m
kube-system rke2-coredns-rke2-coredns-6b795db654-f7k9q 1/1 Running 0 13m
kube-system rke2-coredns-rke2-coredns-autoscaler-945fbd459-pdbj5 1/1 Running 0 13m
kube-system rke2-ingress-nginx-controller-t7w2x 1/1 Running 0 12m
kube-system rke2-metrics-server-544c8c66fc-fqcjm 1/1 Running 0 12m
kube-system rke2-snapshot-controller-59cc9cd8f4-mtvfr 1/1 Running 0 12m
kube-system rke2-snapshot-validation-webhook-54c5989b65-qsm4r 1/1 Running 0 12m
kubectl get nodes
NAME STATUS ROLES AGE VERSION
ip-172-31-41-105.us-east-2.compute.internal Ready control-plane,etcd,master 14m v1.28.3+rke2r1
Environmental Info: RKE2 Version:
OS info:
k8s Version:
Install result:
Pod log:
Describe the problem:
Steps To Reproduce:
Expected behavior: expect the server node to be ready
Other attempts: I used the same install steps on ubuntu-arm64-22.04, and it works.