GlorifiedTypist / k3s-oracle-cloud-free-tier

Create a k3s cluster on Oracle Cloud's free for life tier
GNU General Public License v3.0

Sometimes the servertemplate.sh does not run 100% in cloud init phase #10

Closed valentinvieriu closed 3 years ago

valentinvieriu commented 3 years ago

It has happened to me a couple of times that the cloud-init script modules/free-tier-k3s/scripts/server.template.sh was only partially executed. I had to ssh in and run the remaining code manually:


curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="--no-deploy traefik" K3S_CLUSTER_SECRET='${cluster_token}' sh -s - server --tls-san="k3s.local"

while ! nc -z localhost 6443; do
  sleep 1
done

mkdir /home/opc/.kube
cp /etc/rancher/k3s/k3s.yaml /home/opc/.kube/config
sed -i "s/127.0.0.1/$(curl -s ifconfig.co)/g" /home/opc/.kube/config
chown opc:opc /home/opc/.kube/ -R

iptables -D INPUT -i ens3 -p tcp --dport 6443 -j DROP

This blocks the installation process.

Here is the output of sudo cat /var/log/cloud-init-output.log:

[opc@server ~]$ sudo cat /var/log/cloud-init-output.log
Cloud-init v. 20.3-10.0.1.el8_4.2 running 'init-local' at Tue, 06 Jul 2021 13:42:26 +0000. Up 32.95 seconds.
Cloud-init v. 20.3-10.0.1.el8_4.2 running 'init' at Tue, 06 Jul 2021 13:42:32 +0000. Up 38.36 seconds.
ci-info: ++++++++++++++++++++++++++++++++++++Net device info+++++++++++++++++++++++++++++++++++++
ci-info: +--------+------+-------------------------+---------------+--------+-------------------+
ci-info: | Device |  Up  |         Address         |      Mask     | Scope  |     Hw-Address    |
ci-info: +--------+------+-------------------------+---------------+--------+-------------------+
ci-info: |  ens3  | True |        10.0.1.142       | 255.255.254.0 | global | 02:00:17:06:7a:9b |
ci-info: |  ens3  | True | fe80::17ff:fe06:7a9b/64 |       .       |  link  | 02:00:17:06:7a:9b |
ci-info: |   lo   | True |        127.0.0.1        |   255.0.0.0   |  host  |         .         |
ci-info: |   lo   | True |         ::1/128         |       .       |  host  |         .         |
ci-info: +--------+------+-------------------------+---------------+--------+-------------------+
ci-info: +++++++++++++++++++++++++++Route IPv4 info++++++++++++++++++++++++++++
ci-info: +-------+-------------+----------+---------------+-----------+-------+
ci-info: | Route | Destination | Gateway  |    Genmask    | Interface | Flags |
ci-info: +-------+-------------+----------+---------------+-----------+-------+
ci-info: |   0   |   0.0.0.0   | 10.0.0.1 |    0.0.0.0    |    ens3   |   UG  |
ci-info: |   1   |   0.0.0.0   | 10.0.0.1 |    0.0.0.0    |    ens3   |   UG  |
ci-info: |   2   |   10.0.0.0  | 0.0.0.0  | 255.255.254.0 |    ens3   |   U   |
ci-info: |   3   |   10.0.0.0  | 0.0.0.0  | 255.255.254.0 |    ens3   |   U   |
ci-info: |   4   | 169.254.0.0 | 0.0.0.0  |  255.255.0.0  |    ens3   |   U   |
ci-info: |   5   | 169.254.0.0 | 0.0.0.0  |  255.255.0.0  |    ens3   |   U   |
ci-info: +-------+-------------+----------+---------------+-----------+-------+
ci-info: +++++++++++++++++++Route IPv6 info+++++++++++++++++++
ci-info: +-------+-------------+---------+-----------+-------+
ci-info: | Route | Destination | Gateway | Interface | Flags |
ci-info: +-------+-------------+---------+-----------+-------+
ci-info: |   1   |  fe80::/64  |    ::   |    ens3   |   U   |
ci-info: |   3   |    local    |    ::   |    ens3   |   U   |
ci-info: |   4   |  multicast  |    ::   |    ens3   |   U   |
ci-info: +-------+-------------+---------+-----------+-------+
Cloud-init v. 20.3-10.0.1.el8_4.2 running 'modules:config' at Tue, 06 Jul 2021 13:42:39 +0000. Up 44.56 seconds.
Running configuration script...
+ yum update -y oracle-cloud-agent
Ksplice for Oracle Linux 8 (x86_64)             1.0 MB/s | 375 kB     00:00    
MySQL 8.0 for Oracle Linux 8 (x86_64)           5.0 MB/s | 1.6 MB     00:00    
MySQL 8.0 Tools Community for Oracle Linux 8 (x 105 kB/s | 106 kB     00:01    
MySQL 8.0 Connectors Community for Oracle Linux  37 kB/s |  14 kB     00:00    
Oracle Software for OCI users on Oracle Linux 8 9.9 MB/s | 6.9 MB     00:00    
Oracle Linux 8 BaseOS Latest (x86_64)            40 MB/s |  37 MB     00:00    
Oracle Linux 8 Application Stream (x86_64)       22 MB/s |  29 MB     00:01    
Oracle Linux 8 Addons (x86_64)                  1.5 MB/s | 212 kB     00:00    
Latest Unbreakable Enterprise Kernel Release 6   44 MB/s |  20 MB     00:00    
Last metadata expiration check: 0:00:01 ago on Tue 06 Jul 2021 01:44:37 PM GMT.
Dependencies resolved.
Nothing to do.
Complete!
+ systemctl disable firewalld --now
Removed /etc/systemd/system/multi-user.target.wants/firewalld.service.
Removed /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.
+ iptables -A INPUT -i ens3 -p tcp --dport 6443 -j DROP
+ iptables -I INPUT -i ens3 -p tcp -s 10.0.0.0/8 --dport 6443 -j ACCEPT
+ curl -sfL https://get.k3s.io
+ INSTALL_K3S_EXEC='--no-deploy traefik'
+ K3S_CLUSTER_SECRET='B:W82)1m!R_!<e3sXvC6vS6C:eLF!a$0)tvmq:q>N-}4B4tS7N3cd2wfU=!]K3wq'
+ sh -s - server --tls-san=k3s.local
[INFO]  Finding release for channel stable
[INFO]  Using v1.21.2+k3s1 as release
[INFO]  Downloading hash https://github.com/k3s-io/k3s/releases/download/v1.21.2+k3s1/sha256sum-amd64.txt
[INFO]  Downloading binary https://github.com/k3s-io/k3s/releases/download/v1.21.2+k3s1/k3s
[INFO]  Verifying binary download
[INFO]  Installing k3s to /usr/local/bin/k3s
Rancher K3s Common (stable)                     1.4 kB/s | 1.2 kB     00:00    
Dependencies resolved.
==================================================================================================================
 Package                        Arch    Version                                   Repository                  Size
==================================================================================================================
Installing:
 k3s-selinux                    noarch  0.3-0.el8                                 rancher-k3s-common-stable   18 k
Installing dependencies:
 container-selinux              noarch  2:2.162.0-1.module+el8.4.0+20195+0a4a4953 ol8_appstream               52 k
 policycoreutils-python-utils   noarch  2.9-14.0.1.el8                            ol8_baseos_latest          252 k
Enabling module streams:
 container-tools                        ol8                                                                       

Transaction Summary
==================================================================================================================
Install  3 Packages

Total download size: 322 k
Installed size: 267 k
Downloading Packages:
(1/3): policycoreutils-python-utils-2.9-14.0.1. 1.1 MB/s | 252 kB     00:00    
(2/3): container-selinux-2.162.0-1.module+el8.4 220 kB/s |  52 kB     00:00    
(3/3): k3s-selinux-0.3-0.el8.noarch.rpm          23 kB/s |  18 kB     00:00    
--------------------------------------------------------------------------------
Total                                           404 kB/s | 322 kB     00:00     
warning: /var/cache/dnf/rancher-k3s-common-stable-c6e9c12b44bffb7c/packages/k3s-selinux-0.3-0.el8.noarch.rpm: Header V4 RSA/SHA1 Signature, key ID e257814a: NOKEY
Rancher K3s Common (stable)                     5.7 kB/s | 2.4 kB     00:00    
Importing GPG key 0xE257814A:
 Userid     : "Rancher (CI) <ci@rancher.com>"
 Fingerprint: C8CF F216 4551 26E9 B9C9 18BE 925E A29A E257 814A
 From       : https://rpm.rancher.io/public.key
Key imported successfully
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
  Preparing        :                                                        1/1 
  Installing       : policycoreutils-python-utils-2.9-14.0.1.el8.noarch     1/3 
  Running scriptlet: container-selinux-2:2.162.0-1.module+el8.4.0+20195+0   2/3 
  Installing       : container-selinux-2:2.162.0-1.module+el8.4.0+20195+0   2/3 
  Running scriptlet: container-selinux-2:2.162.0-1.module+el8.4.0+20195+0   2/3

GlorifiedTypist commented 3 years ago

This is by design.

The loop below ensures that the local-exec provisioning command does not fetch k3s.yaml until the server is ready and the rewrite to the external address is complete. The final iptables command then allows the local-exec to fetch the transformed kubeconfig.

while ! nc -z localhost 6443; do
  sleep 1
done

In my region this takes around 10-12 minutes. Are you getting a timeout? How long does it take in your region?
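If the wait is hard to tell apart from a hang, one option is to bound the loop with a timeout. This is only a sketch, not what the repository's script does: the wait_until helper name and the 20-minute limit in the example are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): retry a probe command
# once per second until it succeeds or the timeout is reached.
# Usage: wait_until TIMEOUT_SECONDS COMMAND [ARGS...]
wait_until() {
  timeout_secs=$1
  shift
  elapsed=0
  until "$@"; do
    if [ "$elapsed" -ge "$timeout_secs" ]; then
      echo "timed out after ${elapsed}s waiting for: $*" >&2
      return 1
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 0
}

# Example: wait up to 20 minutes (an assumed limit) for the k3s API port.
# wait_until 1200 nc -z localhost 6443
```

With a bound like this, a slow region still completes normally, while a genuinely stuck install leaves a timeout message in /var/log/cloud-init-output.log instead of looping forever.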

valentinvieriu commented 3 years ago

I'll give it another try and report back. I remember the wait being quite long (maybe 30 min), so I expected that something was wrong.

valentinvieriu commented 3 years ago

I guess you were right. After 10 minutes it worked. It had a hiccup installing the ingress, but a second run of terraform apply worked. I think it used the previous config in ~/.kube/k3s: the IP it tried to connect to is not the current k8s API server.

module.free-tier-k3s.null_resource.kubeconfig: Still creating... [10m10s elapsed]
module.free-tier-k3s.null_resource.kubeconfig: Still creating... [10m20s elapsed]
module.free-tier-k3s.null_resource.kubeconfig (local-exec): Warning: Permanently added '129.159.198.62' (ECDSA) to the list of known hosts.
module.free-tier-k3s.null_resource.kubeconfig: Creation complete after 10m23s [id=7162798813511390964]
module.free-tier-k3s.helm_release.nginx-ingress: Creating...
╷
│ Error: Post "https://132.145.240.93:6443/api/v1/namespaces": dial tcp 132.145.240.93:6443: i/o timeout
│ 
│   with module.free-tier-k3s.kubernetes_namespace.nginx-ingress,
│   on modules/free-tier-k3s/helm.tf line 2, in resource "kubernetes_namespace" "nginx-ingress":
│    2: resource "kubernetes_namespace" "nginx-ingress" {
│ 
╵
╷
│ Error: create: failed to create: namespaces "nginx-ingress" not found
│ 
│   with module.free-tier-k3s.helm_release.nginx-ingress,
│   on modules/free-tier-k3s/helm.tf line 8, in resource "helm_release" "nginx-ingress":
│    8: resource "helm_release" "nginx-ingress" {
│
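A quick way to check for a stale kubeconfig is to print the server address it records and compare it with the instance's current public IP. The sketch below assumes that ~/.kube/k3s is a regular kubeconfig file; the kubeconfig_server helper is hypothetical and not part of the repository.

```shell
#!/bin/sh
# Hypothetical helper: print the API server URL(s) recorded in a kubeconfig.
# Usage: kubeconfig_server PATH_TO_KUBECONFIG
kubeconfig_server() {
  # Match lines of the form "    server: https://host:6443" and print the URL.
  awk '$1 == "server:" { print $2 }' "$1"
}

# Example (path taken from the comment above; adjust if yours differs):
# kubeconfig_server "$HOME/.kube/k3s"
```

If the printed address differs from the IP that local-exec just connected to, the file is likely left over from an earlier apply.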