Closed rfay closed 11 months ago
I have the same issue.
docker network create test
docker run --rm -it --network test alpine
apk add --no-cache curl && curl test12345.s3.ap-northeast-1.amazonaws.com
docker network rm test
docker network create test
docker run --rm -it --network test alpine
apk add --no-cache curl && curl test1234.s3.ap-northeast-1.amazonaws.com
docker network rm test
This reproduces when the FQDN is more than 41 characters long and the container is on a non-default Docker network.
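To make the length boundary easier to check, here is a small sketch that measures the two hostnames used in the reproduction above. The helper name fqdn_len is made up for illustration. Note that the two hostnames measure 41 and 40 characters, so the exact threshold ("more than 41" versus "41 or more") is worth double-checking against your own results.

```shell
#!/bin/sh
# Print the length of a hostname in characters (POSIX parameter expansion).
# fqdn_len is a hypothetical helper name, not part of docker or colima.
fqdn_len() {
  printf '%s\n' "${#1}"
}

for host in \
  test12345.s3.ap-northeast-1.amazonaws.com \
  test1234.s3.ap-northeast-1.amazonaws.com
do
  printf '%s %s\n' "$(fqdn_len "$host")" "$host"
done
# prints:
# 41 test12345.s3.ap-northeast-1.amazonaws.com
# 40 test1234.s3.ap-northeast-1.amazonaws.com
```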
In my case...
open ~/.colima/default/colima.yaml
edit network.driver
network:
  driver: slirp
Added youtube.com to the list in OP
Notice that youtube.com worked for me, but www.youtube.com didn't work. 😉
Edited, thanks @renatho
If indeed using slirp as the network driver fixes it, this should be resolved by the next release, v0.5.0.
I can add sbp-plugin-binaries.s3.eu-west-1.amazonaws.com
I would like to know if this is still the case for v0.5.0.
Thanks.
Fixed in my environment.
I've observed sporadic failures with golang.org; I'm running on a 2021 Mac M1 Silicon using the vz virtualization driver. This manifests when using the devcontainer cli to build workspace images.
$ yq '.network.driver' "$(colima template --print)"
gvproxy
$ colima version
colima version 0.5.2
git commit: 6b5b6fe0540e708f0c9d6e8919fab292c671fc72
runtime: docker
arch: aarch64
client: v23.0.1
server: v20.10.20
this is still not fixed in 0.5.4
I got bitten by this today as well and I can confirm it only happens with gvproxy network.
It appears some DNS queries fail for whatever reason.
I am still investigating.
Same here. When I run:
nslookup test.s3-website-us-east-1.amazonaws.com
Server: 192.168.107.1
Address: 192.168.107.1:53
Non-authoritative answer:
**server can't find test.s3-website-us-east-1.amazonaws.com: NXDOMAIN**
But if I use Google's 8.8.8.8:
nslookup test.s3-website-us-east-1.amazonaws.com 8.8.8.8
Server: 8.8.8.8
Address: 8.8.8.8:53
Non-authoritative answer:
test.s3-website-us-east-1.amazonaws.com canonical name = s3-website.us-east-1.amazonaws.com
Non-authoritative answer:
test.s3-website-us-east-1.amazonaws.com canonical name = s3-website.us-east-1.amazonaws.com
Name: s3-website.us-east-1.amazonaws.com
Address: 52.217.87.195
Name: s3-website.us-east-1.amazonaws.com
Address: 52.216.27.3
Name: s3-website.us-east-1.amazonaws.com
Address: 52.216.98.42
Name: s3-website.us-east-1.amazonaws.com
Address: 52.216.243.67
Name: s3-website.us-east-1.amazonaws.com
Address: 52.217.140.13
Name: s3-website.us-east-1.amazonaws.com
Address: 52.216.57.53
Name: s3-website.us-east-1.amazonaws.com
Address: 52.217.10.139
Name: s3-website.us-east-1.amazonaws.com
Address: 52.217.137.173
If I change the network driver to slirp, the problem becomes that host.docker.internal is resolved via /etc/hosts, but I need it to be resolved via DNS lookup:
nslookup host.docker.internal
Server: 127.0.0.11
Address: 127.0.0.11:53
** server can't find host.docker.internal: NXDOMAIN
** server can't find host.docker.internal: NXDOMAIN
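The side-by-side comparison above (the VM's resolver versus 8.8.8.8) can be scripted so that others can report failing names in a consistent way. This is only a sketch: the function names has_nxdomain and compare_resolvers are invented here, and it only looks for the NXDOMAIN marker that appears in the nslookup output above. A run without NXDOMAIN is not proof of a successful lookup (it could also be a timeout).

```shell
#!/bin/sh
# Succeeds (exit 0) if the nslookup output on stdin reports NXDOMAIN.
# Hypothetical helper; it only matches the marker text seen in this thread.
has_nxdomain() {
  grep -q 'NXDOMAIN'
}

# Query one name against several resolvers and report which return NXDOMAIN.
# Usage: compare_resolvers NAME SERVER...
compare_resolvers() {
  name=$1
  shift
  for server in "$@"; do
    if nslookup "$name" "$server" 2>&1 | has_nxdomain; then
      printf '%s: NXDOMAIN from %s\n' "$name" "$server"
    else
      printf '%s: no NXDOMAIN from %s\n' "$name" "$server"
    fi
  done
}

# Example, to be run inside the container (not executed here):
# compare_resolvers test.s3-website-us-east-1.amazonaws.com 192.168.107.1 8.8.8.8
```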
@gpsa can you kindly open another issue for this? This is likely a bug.
I could, but just to clarify, is the slirp driver expected to resolve host.docker.internal via DNS lookup?
@gpsa I suspect your issue was changing the network driver of an existing VM.
This is what I get for slirp, it uses DNS lookup as well.
nslookup host.docker.internal
Server: 192.168.5.3
Address: 192.168.5.3:53
Non-authoritative answer:
Name: host.docker.internal
Address: 192.168.5.2
Non-authoritative answer:
@abiosoft Is there a way to recreate it without destroying everything? I could try to see if by recreating would work
@gpsa yeah. It's a regression actually, used to work before. You can edit the /etc/resolv.conf file in the VM and set the nameserver IP to 192.168.5.3.
In fact, it is the only entry in the file, so you can simply replace it:
colima ssh -- sudo sh -c 'echo "nameserver 192.168.5.3" > /etc/resolv.conf'
@abiosoft thank you so much, that worked like a breeze. Now both internal Docker DNS and external domains work just fine on SLIRP.
Could the DNS issues somehow be related to Alpine?
From https://martinheinz.dev/blog/92:
Usually, you would not notice this difference, because most of the time a single UDP packet (512 bytes) is enough to resolve hostnames... until it isn't enough and your application (running on Kubernetes) that previously worked completely fine for months suddenly starts throwing "Unknown Host" exceptions for one particular (very critical) hostname. The worst part is that this can manifest randomly, anytime when some external network change causes the resolution of some particular domain to require more than the 512 bytes available in single UDP packet.
@henrik242 I have actually read something similar before, but I do not think this situation is related to Alpine, considering that slirp works fine.
As for why Alpine is the choice for Colima, you can check this comment https://github.com/abiosoft/colima/issues/291#issuecomment-1131229618.
SLIRP mode is now "crashing" the same way Lima alone was behaving. So, basically the mount points stop working and:
On the Host
docker ps
Cannot connect to the Docker daemon at unix:///Users/user/.colima/default/docker.sock. Is the docker daemon running?
@gpsa you're making a bit of a mess of this issue. Could you please open one that's on-topic for your issues?
Sorry about that. I've created a separate issue for the SLIRP problem.
When starting colima with the VZ VM type and virtiofs and providing --dns 192.168.5.3, AWS hostname resolution seems to fail as well. Without it, resolution seems to work but results in the pull speed issues from https://github.com/abiosoft/colima/issues/648, no matter whether slirp or gvproxy is used, though I think the network driver setting is probably ignored for the VZ VM type.
@abiosoft https://wiki.musl-libc.org/functional-differences-from-glibc.html
There are multiple reports of odd DNS incompatibilities between musl and glibc. I think it is safer to use a base image like Debian for this.
I would like to use Debian as well to see if it resolves this issue for us. Is that possible?
After some messing around, this seems to be the fix:
colima delete
colima start --edit
Change gvproxy to slirp.
With such a limitation/bug, I wonder why it's not the default.
If anyone wants to switch, the following should also be possible:
colima start --edit
# change the value using "i" (insert mode), switch to slirp
# save via ":wq"
Or edit ~/.colima/default/colima.yaml and restart colima via colima stop and colima start.
No need for colima delete (as far as I know).
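For setups where an interactive editor is inconvenient, the config edit described above could be scripted. This is a rough sketch under assumptions: the function name set_slirp_driver is made up, and it assumes the driver sits on its own line as "driver: gvproxy" (as in the network.driver snippet earlier in this thread). Newer config files may not have that field at all, in which case this does nothing.

```shell
#!/bin/sh
# Rewrite "driver: gvproxy" to "driver: slirp" in a Colima config file.
# Assumption: the value appears on its own line as "driver: gvproxy".
# set_slirp_driver is a hypothetical helper, not a colima command.
# Usage: set_slirp_driver /path/to/colima.yaml
set_slirp_driver() {
  config=$1
  tmp="$config.tmp.$$"
  # Write to a temp file first; in-place sed flags differ between macOS and GNU.
  sed 's/^\([[:space:]]*driver:[[:space:]]*\)gvproxy[[:space:]]*$/\1slirp/' \
    "$config" > "$tmp" && mv "$tmp" "$config"
}

# Example (not executed here):
# colima stop
# set_slirp_driver "$HOME/.colima/default/colima.yaml"
# colima start
```

Lines that do not match are left untouched, so running it twice is harmless.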
For me, after simply restarting, nothing seemed to be working.
To be more specific, a docker build failed right at the beginning, because it could not even resolve registry-1.docker.io. It was an i/o timeout right there, suggesting all/most networking was broken in the VM.
I got the idea for the delete from here.
Hi! I started with colima version 0.5.5 two months ago and changing the config + restart worked fine for me today (without deleting).
@rfay just mentioned in DDEV discord the following:
If you have had your colima instance through many updates, it's a worthwhile thing to delete it and recreate it. (After saving away databases of course via ddev snapshot -a)
So depends on how many updates happened in the meantime I guess?
Just wondering, but colima start --network-driver slirp should work as well, shouldn't it? It would be easier to use in a command for setup (no need to search/replace in the config file).
Though the last time I tried it, it made no difference with virtual machine type vz. I admit I did not delete the instance, so maybe that helps. I am not sure the network driver is even relevant for vz, but it is worth a try.
Does this replace and save things in the current configuration before starting? Would be cool! (I'll try later, thanks for hint).
I hit this while running a container which does a lot of AWS service requests. DNS resolution would fail after some time when using the vz VM, then a subsequent run would fail almost immediately, and only colima restart helped to get more time without DNS failures. And with qemu and the slirp network driver it was actually even worse. So I resorted to Docker Desktop, which runs without problems. Sad.
Same here. With vmType: vz and either the gvproxy or slirp network driver, I still get loads of errors when resolving DNS; I would say about 50% of the requests fail.
Hi, just wondering, which lima version are you using? https://github.com/abiosoft/colima/issues/648 seems to be fixed (at least it looks like it so far) with the latest lima 0.18.x update, and it was related to DNS as well, so it might also fix these issues.
I'm still getting connection refused on 127.0.0.11
I don't have a local dns server on dev machines and I can't figure out what the solution is here. How do we avoid this?
The latest version of Colima doesn't even have a driver field in the yaml file and I'm still having this problem.
Same here. Sadly the only workaround that works for me is to add a DNS address with colima start --dns 8.8.8.8 or in the config file ~/.colima/default/colima.yaml.
If the DNS changes, I have to restart colima (colima restart) to make DNS work again.
See #711
Description
I'm starting this issue so we can start to track down the specific DNS addresses that fail in colima/lima, and the sources of information. I get this question all the time, and tell people to use --dns 1.1.1.1 and it almost always fixes it. But I think we should start to track what they are so maybe we can solve this someday.
Version
Colima Version: Various
Lima Version:
Qemu Version:
Operating System
Workarounds
Many people have reported in the comments that changing to the slirp network driver resolved the issue.