amartin120 opened 2 years ago
Hopefully this is not polluting the wrong issue, but: we (myself and some of my team) see the same issue with k8s disabled, using docker compose on Windows 11. Restarting the affected containers eventually (so far never more than 2 restarts) resolves the problem.
This is also an issue on WSL with Rancher Desktop 1.6+: ports 80 and 443 are no longer bound by default.
Adding my comments from slack here:
I just want to confirm before I open a GH issue. It does appear that in 1.6.x on windows, port 80 and 443 no longer make it to Traefik. Is this expected?
Another test:
I previously thought this was related to the Windows 11 22H2 upgrade, but I verified this issue on a different machine without the upgrade.
Here is my `kubectl get events` output:

```
14m     Normal    NodeHasNoDiskPressure     node/int208   Node int208 status is now: NodeHasNoDiskPressure
14m     Normal    NodeHasSufficientPID      node/int208   Node int208 status is now: NodeHasSufficientPID
14m     Normal    NodeNotReady              node/int208   Node int208 status is now: NodeNotReady
14m     Normal    RegisteredNode            node/int208   Node int208 event: Registered Node int208 in Controller
13m     Normal    NodeAllocatableEnforced   node/int208   Updated Node Allocatable limit across pods
13m     Normal    NodeReady                 node/int208   Node int208 status is now: NodeReady
4m52s   Normal    Starting                  node/int208
4m52s   Warning   listen tcp4 :32535: bind: address already in use   node/int208   can't open port "nodePort for kube-system/traefik:web" (:32535/tcp4), skipping it
4m52s   Warning   listen tcp4 :30161: bind: address already in use   node/int208   can't open port "nodePort for kube-system/traefik:websecure" (:30161/tcp4), skipping it
4m48s   Normal    Starting                  node/int208   Starting kubelet.
4m48s   Warning   InvalidDiskCapacity       node/int208   invalid capacity 0 on image filesystem
4m48s   Normal    NodeAllocatableEnforced   node/int208   Updated Node Allocatable limit across pods
4m48s   Normal    NodeHasSufficientMemory   node/int208   Node int208 status is now: NodeHasSufficientMemory
4m48s   Normal    NodeHasNoDiskPressure     node/int208   Node int208 status is now: NodeHasNoDiskPressure
4m48s   Normal    NodeHasSufficientPID      node/int208   Node int208 status is now: NodeHasSufficientPID
4m48s   Warning   Rebooted                  node/int208   Node int208 has been rebooted, boot id: d0b003fe-a03a-4130-b39f-e48af6217785
4m42s   Normal    RegisteredNode            node/int208   Node int208 event: Registered Node int208 in Controller
```
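The `address already in use` warnings suggest something inside the VM is still holding Traefik's NodePorts when kubelet comes back up. A diagnostic sketch for checking that (my assumption, not from the thread): run it inside the VM, e.g. via `rdctl shell`; the port numbers are taken from the events above and will differ per cluster.

```shell
#!/bin/sh
# Diagnostic sketch: see what is listening on the conflicting NodePorts.
# Ports 32535/30161 come from the warning events above; substitute your own.
for port in 32535 30161; do
  echo "--- port $port ---"
  ss -tlnp | grep ":$port " || echo "nothing listening on $port"
done
```

If a stale process shows up here, that would explain why restarting the node (or waiting for the old listener to die) makes the ports come back.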
Still not working with RD 1.7.0. Staying on 1.5.x continues to work just fine.
Could this be connected to anything in https://github.com/rancher-sandbox/rancher-desktop-host-resolver?
I think that I've resolved this, for myself at least. My local network was forwarding ports 80 and 443 to my local IP (i.e. my en0), and this was perfectly fine when I was running RD 1.5.x and lower. However, on RD 1.6 and 1.7, if I adjust my port forwarding rules to route 80 and 443 to the lima-0 IP, I'm back up and running.
Can you by chance provide a quick example of that?
(I'm guessing `kubectl port-forward --address lima-0 --namespace kube-system service/traefik 443:443`)
For me it was all about my local network. I'm using AWS Route53 for my testing domain, which has an A record pointing to my external-facing IP. My local network router had port forwarding rules routing incoming (80/443) traffic to the local IP of the MacBook (i.e. 192.168.1.x) that I'm running RD on. Despite being slightly overkill for a typical local test environment, the version of lima used in RD 1.5 and earlier was able to resolve that just fine within the VM.
However, starting with RD 1.6, the above setup stopped resolving within the lima VM. So what I ended up doing is adjusting my router's port forwarding rules to send (80/443) traffic straight to the lima-0 VM IP (which is probably what the Traefik service LoadBalancer in your cluster is mapped to).
Does that help?
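For anyone else trying this, a sketch of how to find the addresses involved (my assumptions, not from the thread): `rdctl shell` runs a command inside the Rancher Desktop VM, and interface names vary between RD versions, so listing every IPv4 address and picking the LAN-facing one seems safest.

```shell
# Sketch: locate the lima-0 VM's IPv4 addresses. Interface names vary by
# RD version, so list them all and pick the LAN-facing one.
rdctl shell ip -4 -o addr show

# If Traefik is enabled, its LoadBalancer IP is another candidate target:
kubectl get svc traefik -n kube-system \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
```

Whichever address you pick is the one to point the router's 80/443 forwarding rules at.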
Actual Behavior
I'll preface by stating that I have Traefik disabled in favor of Istio Ingress for my Rancher Desktop k8s setup, because of certain types of testing that I do. In RD versions 1.5.x and earlier, I have been able to access my Kubernetes applications without issue using wildcard domain names that I get via AWS Route53 and certs issued from Cert Manager. However, something has changed starting with RD version 1.6, and I can no longer access my applications.
Steps to Reproduce
I'll use the fake domain of "rancher.mydomain.com" for the sake of this issue.
In RD 1.5.x and earlier, I can curl https://rancher.mydomain.com and my Istio Ingress will successfully route to my app and display the correct results. The web browser also reflects the same successful results.
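One way to separate a DNS/port-forwarding problem from an ingress problem is curl's `--resolve` flag, which pins the hostname to a chosen IP without touching DNS or the router. A sketch using the fake domain from this issue (the IP is a placeholder; substitute the lima-0 VM or ingress IP):

```shell
# Sketch: send the request straight to a candidate IP, bypassing DNS and
# router port forwarding. 192.168.5.15 is a placeholder VM IP; -k skips
# certificate verification, which is fine for a connectivity test.
curl -vk --resolve rancher.mydomain.com:443:192.168.5.15 \
  https://rancher.mydomain.com/
```

If this succeeds against the VM IP while the plain curl fails, the ingress itself is fine and the problem is in how traffic reaches the VM.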
In RD 1.6.x, when I run the same curl locally, I get
and from a lima shell:
Result
Unable to connect to my applications like I could in 1.5.x and earlier.
Expected Behavior
Things worked the same as RD 1.5.x and earlier.
Additional Information
The only solution I have currently is to downgrade RD to 1.5.x and do a factory reset; everything then starts working fine again. No application/ingress config is changed when upgrading or downgrading.
Rancher Desktop Version
1.6.x
Rancher Desktop K8s Version
1.23.13
Which container engine are you using?
containerd (nerdctl)
What operating system are you using?
macOS
Operating System / Build Version
Monterey 12.6.1 and Ventura 13.0
What CPU architecture are you using?
arm64 (Apple Silicon)
Linux only: what package format did you use to install Rancher Desktop?
No response
Windows User Only
No response