kahkhang / kube-linode

:whale: Provision a Kubernetes/CoreOS cluster on Linode
MIT License
212 stars 31 forks source link

Time to provision? #4

Closed codekoala closed 7 years ago

codekoala commented 7 years ago

Hi, this project looks fabulous. Thank you for putting it together!

I tried to use it a couple times today, but I feel like it's a bit stuck. Do you have an estimate for about how long it should take to provision each linode? I let it run twice, each time for over an hour, with no end in sight simply setting up the master node.

image

I'll leave it running a bit longer this time just to see if I'm too impatient.

kahkhang commented 7 years ago

Hi! Thanks for the issue! Unfortunately, sometimes the install script might get stuck downloading the CoreOS image file. From my experience, it should take no more than 15 minutes. If it takes longer, try restarting the script for now and the script will try to reinstall again. Thanks for filing the issue!

I've been experiencing this issue intermittently as well, might be due to the CoreOS install script curling with no timeout, and I'll try to investigate it when I find time. Thanks for raising this up! :)

kahkhang commented 7 years ago

Sorry I just realized it's stuck in creating disk from stackscript, which I've not encountered before. Could I get the parameters that you passed into the script that might replicate this error? (I.e. data center, plans) . is it always stuck at that stage?

codekoala commented 7 years ago

Thank you for the quick response! I just checked the instance I left running earlier, and it's still spinning on the same step that it has stopped at each time for me. Looks like it has been about 5.5 hours!

Here is the relevant info from my settings.env file:

DATACENTER_ID=6
MASTER_PLAN=4
WORKER_PLAN=4
NO_OF_WORKERS=3
codekoala commented 7 years ago

I tried it a couple more times with different settings, and each combination stayed spinning on the same step. I tried creating them in the Fremont, CA datacenter (DATACENTER_ID=3, iirc), and I tried creating them in Newark, NJ (DATACENTER_ID=6) with the 1GB plan. No love :(

kahkhang commented 7 years ago

Sorry I just tried Newark, NJ, and it seems to be booting. Could I ask what OS you are running this script on?

screen shot 2017-06-28 at 11 19 56 am screen shot 2017-06-28 at 11 19 14 am

I'm sorry I'm not sure what really happened so I can't really offer much help at this point, but if you'd like could you run rm -rf ~/.kube-linode along with the install instructions curl -s https://raw.githubusercontent.com/kahkhang/kube-linode/master/install.sh | bash again, then see if this error still occurs. You could also try placing an echo here https://github.com/kahkhang/kube-linode/blob/master/utilities.sh#L402 and seeing if the stackscript parameters json are getting parsed correctly.

You can also try filing a Linode support ticket to verify that it is not a Linode infra issue. Thanks, and sorry for the trouble!

codekoala commented 7 years ago

Thank you for the update! I'm running it on an up-to-date install of Arch Linux. I'll try it on some other distros to see if that helps and update this issue with any findings.

kahkhang commented 7 years ago

Ah im sorry, it might have failed at some point because I was placing >/dev/null 2&1 everywhere and silencing errors. If you'd like you could remove this in ~/.kube-linode/kube-linode.sh and ~/.kube-linode/utilities.shand seeing if any errors pop up. I've only tested this in Mac OS, but I'll try to spin up a Arch Linux VM sometime later this evening and try debugging it. Thanks for the information!

kahkhang commented 7 years ago

I just ran it in an Arch Linux VM and im facing the same issue, I'll debug it further and let you know once its fixed :)

kahkhang commented 7 years ago

I found a bug with the base64 command in arch linux, which behaves differently from OSX as it splits the output in multiple lines without passing in the wrap=0 parameter, thus malforming the stackscript json. I've pushed a fix and tried installing using a ArchLinux docker container. Could you try it again and see if its working now?

I made some notes on how to get the dependencies set up in Arch Linux, and I'll paste it here if its helpful to you :)

# Run Arch Linux in Docker
docker run --tty --interactive --rm hoverbear/archlinux /bin/bash

# Update dependencies
pacman -Syu
pacman --noconfirm -S apache jq git base-devel openssh bc

# Add new user
useradd -m -s /bin/bash USERNAME
passwd USERNAME
su - USERNAME

# Install kubectl
git clone https://aur.archlinux.org/kubectl-bin.git
cd kubectl-bin
makepkg -s

exit
#As root
pacman -U /home/USERNAME/kubectl-bin/kubectl-bin-*.pkg.tar.xz

#Run kube-linode
su - USERNAME

#To install
curl -s https://raw.githubusercontent.com/kahkhang/kube-linode/master/install.sh | bash

## Run kube-linode
~/.kube-linode/kube-linode.sh

tput cnorm
stty sane
codekoala commented 7 years ago

Fantastic! It got past the "Disk Create From StackScript" stage!

image

Thank you thank you!

This is exactly why OSX drives me nuts--it gives you the illusion that you're basically working with Linux under the hood...until you run into a situation like this. The behavior of most standard BSD commands is generally identical to their GNU counterparts, but you occasionally run into these subtle difference that break things in weird ways. /rant

Anyway, thank you for taking the time to figure this out! It looks like the script is happily moving along now!

kahkhang commented 7 years ago

Thanks @codekoala for the update! Yep I haven't been careful enough in ensuring BSD/GNU compatibility, will keep that in mind from now onwards :)