jpetazzo / ampernetacle


Failed to connect to : Connection refused #43

Closed humbertocrispim closed 1 year ago

humbertocrispim commented 2 years ago

After creating the VMs in OCI, the following message repeats in a loop and Terraform never moves on to the next step:

null_resource.wait_for_kube_apiserver (local-exec): curl: (7) Failed to connect to 132.226.252.238 port 6443 after 78 ms: Connection refused
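
For anyone who wants to check the port outside of Terraform: "Connection refused" generally means the host is reachable but nothing is accepting connections on that port yet, or a firewall is actively rejecting them. Below is a minimal sketch of the same kind of polling in Python. It is not the repository's `wait_for_kube_apiserver` implementation (the log shows that one uses curl), and the function names `port_is_open` and `wait_for_port` are hypothetical.

```python
import socket
import time


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def wait_for_port(host: str, port: int, attempts: int = 30, delay: float = 10.0) -> bool:
    """Poll host:port until it accepts connections or the attempts run out."""
    for attempt in range(1, attempts + 1):
        if port_is_open(host, port):
            return True
        print(f"attempt {attempt}/{attempts}: {host}:{port} not reachable yet")
        if attempt < attempts:
            time.sleep(delay)
    return False
```

Calling `wait_for_port("132.226.252.238", 6443)` with your own control plane address tells you whether the API server port ever opens. If it never does, the problem is on the node (cloud-init, kubeadm, or firewall rules) rather than in the Terraform loop itself.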

[screenshot]

antonsatskyi commented 1 year ago

same here

RMorgado commented 1 year ago

Same here :(

marlonleite commented 1 year ago

same here :(

andreixhz commented 1 year ago

same here :(

weskhel commented 1 year ago

same here :(

JandersonFB commented 1 year ago

same here :'(

jpetazzo commented 1 year ago

Hi! Please try again with the latest version (make sure to git pull!) as I just pushed a bunch of updates that should improve the process a lot.

Also, now Terraform will show the cloud-init progress; I hope it will give a better idea about what's going on, especially in case of problems!

JandersonFB commented 1 year ago

All nodes started: [screenshot]

But the Weave Net pods crash: [screenshots]

RMorgado commented 1 year ago

I'm using the latest version and I get this error: [screenshot: Captura de ecrã 2022-12-23, às 15 44 38]

jpetazzo commented 1 year ago

@JandersonFB Sorry about that - I got the wrong Weave YAML URL. I was using https://github.com/weaveworks/weave/releases/download/v2.8.1/weave-daemonset-k8s.yaml instead of https://github.com/weaveworks/weave/releases/download/v2.8.1/weave-daemonset-k8s-1.11.yaml. I've fixed the Terraform configuration and hopefully it should work now. Make sure you have the latest commits and after recreating the VMs it should work!

jpetazzo commented 1 year ago

@RMorgado ah that's an interesting error. Do you get the same error for all 4 instances or just the first one? I wonder if that could be a transient error; i.e. if you try again after a while it will work? (Perhaps delete the VM and recreate it with Terraform; or use terraform taint? Let me know if you need help for that!)

RMorgado commented 1 year ago

Yes @jpetazzo, I get the error on all instances. I've tested it several times and on different days. I logged in to an instance and ran the command "kubectl get nodes", and I got this error: [screenshot: error_cloud]

I can send more information if you need it. Thank you for your time and patience

hebertviana commented 1 year ago

Hello, first of all thanks for the work @jpetazzo. I'm having the same problem as @RMorgado.

I'm trying to identify the cause. I'll run more tests tonight, and if I find a solution I'll post an update here.

hebertviana commented 1 year ago

@jpetazzo and @RMorgado,

To solve the problem, I upgraded the oracle/oci provider version in provider.tf to the latest version, "4.102.0". That alone didn't work for me because I was on Terraform 0.14.8, so I also upgraded Terraform to v1.3.6, and after that terraform ran successfully.
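
A quick way to tell whether a local Terraform is older than the version reported as working above is to compare the version numbers as integers rather than as strings. This is only a sketch: v1.3.6 is the version that worked in this thread, not an official minimum, and the helper names `parse_version` and `is_older_than_known_good` are hypothetical.

```python
import re


def parse_version(text: str) -> tuple[int, ...]:
    """Extract (major, minor, patch) from a string such as 'Terraform v0.14.8'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


# Version reported as working in this thread; an assumption, not an official minimum.
KNOWN_GOOD_TERRAFORM = parse_version("v1.3.6")


def is_older_than_known_good(installed: str) -> bool:
    """Return True if the installed version predates the known-good one."""
    return parse_version(installed) < KNOWN_GOOD_TERRAFORM
```

Pass it the first line printed by `terraform version`, for example `is_older_than_known_good("Terraform v0.14.8")`. Comparing integer tuples avoids the string-comparison trap where "1.10.0" would sort before "1.3.6".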

Apply complete! Resources: 5 added, 1 changed, 4 destroyed.

RMorgado commented 1 year ago

Thank you for your help @hebertviana !!! I made the changes you said and updated terraform to the latest version and it worked!!

But I still get this error, so I'll have to investigate whether it's a firewall issue: [screenshot: erro_kubectl_get_nodes]

If I log in to one of the nodes, everything is OK!

hebertviana commented 1 year ago

@RMorgado,

Uhuu, glad it worked.

Check whether this script, which opens the ports on the firewall, actually ran. From what I saw here, it is in /etc/cloud init/scripts:

1-allow-inbound-traffic.sh

PauloBigooD commented 1 year ago

@hebertviana Could you explain in more detail how this script from /etc/cloud init/scripts was set up?

1-allow-inbound-traffic.sh

I searched locally but didn't find anything related to it.

To work around the problem here, I commented out the following lines in main.tf:

provisioner "remote-exec" {
  inline = [
    "tail -f /var/log/cloud-init-output.log &",
    "cloud-init status --wait >/dev/null",
  ]
}

But I know this is only a stopgap, not a real fix.

jpetazzo commented 1 year ago

Hi everyone! It looks like an extra firewalling rule had been added to the Oracle images. I removed that rule (in commit 0a82500) and it looks like it solved it. Let me know if it works for you!

alessonviana commented 5 months ago

Hey, I updated my fork and compared the files, and they look the same, but I'm still getting the same error: [screenshot]

OmarStewey commented 1 month ago

> Hey I updated my fork, and I compared the files, and seems like the same. but I'm still getting the same error

Did you ever find a fix for this? I'm experiencing the same thing.