Surprisingly, this didn't fix the build for me:
fetch http://dl-cdn.alpinelinux.org/alpine/v3.14/main/armhf/APKINDEX.tar.gz
ERROR: http://dl-cdn.alpinelinux.org/alpine/v3.14/main: temporary error (try again later)
WARNING: Ignoring http://dl-cdn.alpinelinux.org/alpine/v3.14/main: No such file or directory
fetch http://dl-cdn.alpinelinux.org/alpine/v3.14/community/armhf/APKINDEX.tar.gz
2 errors; 20 distinct packages available
ERROR: http://dl-cdn.alpinelinux.org/alpine/v3.14/community: temporary error (try again later)
WARNING: Ignoring http://dl-cdn.alpinelinux.org/alpine/v3.14/community: No such file or directory
I don't get it. I'm able to access the URL with a browser.
I don't get it either.
Building mosquitto
Sending build context to Docker daemon 9.216kB
Step 1/10 : FROM eclipse-mosquitto:latest
latest: Pulling from library/eclipse-mosquitto
Digest: sha256:ce08d3fe69d4170cea2426739af86ac95e683f01dd2c4141da661983a2401364
Status: Image is up to date for eclipse-mosquitto:latest
---> 24a85c54a50e
Step 2/10 : RUN sed -i 's/https/http/' /etc/apk/repositories
---> Running in 622915d96dfa
Removing intermediate container 622915d96dfa
---> c332383971ae
Step 3/10 : RUN apk update && apk add --no-cache rsync tzdata
---> Running in d7f0eebee992
fetch http://dl-cdn.alpinelinux.org/alpine/v3.14/main/armhf/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.14/community/armhf/APKINDEX.tar.gz
v3.14.2-52-g922bf2dee7 [http://dl-cdn.alpinelinux.org/alpine/v3.14/main]
v3.14.2-58-gdbf551a5fb [http://dl-cdn.alpinelinux.org/alpine/v3.14/community]
OK: 13102 distinct packages available
fetch http://dl-cdn.alpinelinux.org/alpine/v3.14/main/armhf/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.14/community/armhf/APKINDEX.tar.gz
(1/5) Installing libacl (2.2.53-r0)
(2/5) Installing popt (1.18-r0)
(3/5) Installing zstd-libs (1.4.9-r1)
(4/5) Installing rsync (3.2.3-r4)
(5/5) Installing tzdata (2021a-r0)
Executing busybox-1.33.1-r3.trigger
OK: 14 MiB in 25 packages
Removing intermediate container d7f0eebee992
---> b5e2d0baab6c
Step 4/10 : ENV IOTSTACK_DEFAULTS_DIR="iotstack_defaults"
---> Running in 738992f2e418
Removing intermediate container 738992f2e418
---> e3d1202fdb2c
Step 5/10 : COPY --chown=mosquitto:mosquitto ${IOTSTACK_DEFAULTS_DIR} /${IOTSTACK_DEFAULTS_DIR}
---> 14469f815067
Step 6/10 : ENV IOTSTACK_ENTRY_POINT="docker-entrypoint.sh"
---> Running in a94f90fbcc9e
Removing intermediate container a94f90fbcc9e
---> 9bfd1bac133c
Step 7/10 : COPY ${IOTSTACK_ENTRY_POINT} /${IOTSTACK_ENTRY_POINT}
---> 6a19f7be27bf
Step 8/10 : RUN chmod 755 /${IOTSTACK_ENTRY_POINT}
---> Running in ad355035cec0
Removing intermediate container ad355035cec0
---> fe01c40bca83
Step 9/10 : ENV IOTSTACK_ENTRY_POINT=
---> Running in 23e63c8aec3d
Removing intermediate container 23e63c8aec3d
---> 6331aff9e195
Step 10/10 : VOLUME ["/mosquitto/config", "/mosquitto/pwfile"]
---> Running in 9333dc802fc7
Removing intermediate container 9333dc802fc7
---> 172b8de07052
Successfully built 172b8de07052
Successfully tagged iotstack_mosquitto:latest
I tried it a few times but I'm totally out of ideas. I had to downgrade to eclipse-mosquitto:2.0.11.
This problem is downright annoying! Annoying for people like you at their wits' end. Annoying for people like me who are trying to figure out why people like you are having trouble that I can't replicate. 🤬🤬🤬
After a bit more Googling, I found these:
To summarise what I get from both of those:
In effect, I'm already implementing both of those suggestions so either or both could explain our different experiences. Here's the background:
I have a local upstream DNS (BIND9 running on a macOS box). Every Pi points to the Mac. The Mac is authoritative for my local domain. Anything the Mac can't answer is handled like this:
forwarders {8.8.8.8; 8.8.4.4; 1.1.1.1; };
That means off-net queries are being directed to those servers in round-robin fashion.
I'd be curious to know whether you have IPv6 enabled and how your DNS is set up?
To clobber IPv6:
Use sudo and your favourite Unix text editor to open /etc/sysctl.conf.
Find the line:
#net.ipv6.conf.all.forwarding=1
After that line, insert these lines:
net.ipv6.conf.all.disable_ipv6=1
net.ipv6.conf.default.disable_ipv6=1
net.ipv6.conf.lo.disable_ipv6=1
Save the file and reboot. Some guides suggest sudo sysctl -p but the problem with that is that, if your SSH connection was established over IPv6, it pulls the rug from under your current session (which hangs) and it can be a pain to convince your local host that it needs to try again via IPv4. A reboot is much cleaner.
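If you prefer copy-and-paste, the same three lines can be appended to the end of the file non-interactively. This is just a convenience sketch and assumes the lines aren't already present:
$ sudo tee -a /etc/sysctl.conf >/dev/null <<'EOF'
net.ipv6.conf.all.disable_ipv6=1
net.ipv6.conf.default.disable_ipv6=1
net.ipv6.conf.lo.disable_ipv6=1
EOF
$ sudo reboot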
How you convince Docker to use 8.8.8.8 will depend on your arrangements:
The suggestion in the two URLs above, augmented with my own recommendations:
Take your stack down.
Use sudo and your favourite Unix text editor to create /etc/docker/daemon.json.
Add the text:
{
"dns":["8.8.8.8"]
}
Save the file and reboot.
Bring the stack up.
I've added the down/reboot/up because experimentation on my own system suggests Docker can get into a mess if you try to convince it to do any of this dynamically (eg with systemctl commands).
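If you want to confirm the setting has actually taken effect, one quick check (my suggestion, not from the articles above) is to start a throw-away container on the default bridge and look at its resolver configuration:
$ docker run --rm alpine cat /etc/resolv.conf
# with the daemon.json patch in place, this should report: nameserver 8.8.8.8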
The above daemon.json patch works on my test Pi in the sense of the containers coming up again without Docker doing any serious dummy-spits. I think it is also working, technically, in that if I go into my Node-RED container (where I have DNS tools installed) and try a dig to a local domain name (one that will be resolved by the Mac but is unknown to 8.8.8.8), there's a change in behaviour:
| dig | unpatched Pi | patched Pi |
|---|---|---|
| @local DNS | succeeds | succeeds |
| @8.8.8.8 | fails | fails |
| undirected | succeeds | fails |
That pattern implies that the undirected queries are being sent to 8.8.8.8 on the patched Pi, and the daemon.json patch is the only possible explanation for the difference.
Running this test from Node-RED also makes the point that the daemon.json patch doesn't just affect Mosquitto Dockerfile builds but all of Docker. I couldn't find any way of saying "use 8.8.8.8 for this Mosquitto build" only. So, even if this does solve your problem, you might need to get into the game of turning it on and off when you want to build Alpine containers with local Dockerfiles.
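For clarity, the three forms of the query behind that table look like this (the host name is a placeholder; 192.168.203.65 is my local DNS, so substitute your own):
$ dig @192.168.203.65 something.my.domain.com   # "@local DNS"
$ dig @8.8.8.8 something.my.domain.com          # "@8.8.8.8"
$ dig something.my.domain.com                   # "undirected"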
And then, of course, we're still left with the other hint in Alpine issue 98 that having a proxy between you and the Alpine repos could be part of the problem.
I'm still scratching my head on all of this. I hope this helps you make some progress.
Don't worry. I'm happy to use the 2.0.11 for now and hope it gets fixed for the next version. The impression I got from the conversation is that there's a bug in libfetch or some other library that makes HTTPS fail - maybe IPv6 too. But for the sake of understanding what's going on, I tried what you suggested.
I have DNS set up using DHCP. It creates resolv.conf automatically, with the name servers of my ISP. I don't have a private DNS server. I don't understand why using 8.8.8.8 would make any difference, but I tried creating the /etc/docker/daemon.json. It didn't help.
I'm using IPv6. I tried disabling it with your instructions. I'm not sure if that really disabled IPv6. This command indicates that it did:
$ sysctl -a 2>/dev/null | grep disable_ipv6
net.ipv6.conf.all.disable_ipv6 = 1
...
But this command seems to indicate that it didn't:
$ cat /sys/module/ipv6/parameters/disable
0
Anyway, it seems that this somehow messes up the network configuration - there's a noticeable lag in my connection to the Pi when I do this. And it didn't fix the problem with the Alpine image.
Don't worry. I'm happy to use the 2.0.11
However, if you do notice it getting fixed, I'd appreciate it if you'd add a comment to this post, please.
I don't understand why using 8.8.8.8 would make any difference
In principle, I agree. It shouldn't.
I think the issue with 8.8.8.8 vs ISP-supplied is the theory that ISPs have been known to intercept DNS traffic so they can try to horn-in on Google's act of selling and supplying tailored ads. Some of the reading I've done (eg The Wire and Ars Technica) suggests that those interception systems can get in the way.
A secondary issue can occur from the propagation delays that are inherent in the "distributed database" nature of the DNS. As an example, I showed you my "forwarders" rule but I also have told BIND to send queries for special domains straight to the servers that are authoritative for those domains. Here's an example:
zone "hopto.org" in {
type forward;
forwarders {194.62.182.53; 45.54.64.53; 204.16.253.53; 194.62.183.53; };
};
I'm signed-up with No-IP.com for dynamic DNS. My domain name with them is a sub-domain of hopto.org. The servers in that list will be the first to find out when my ISP changes the IP address of my router's WAN interface so that's the best place to look for authoritative information.
In general, the big, beefy DNS hosts like 8.8.8.8 and friends tend to discover updates more quickly than the "PC under the desk" of a small corner-store ISP. Presumably, most ISPs are somewhere between those extremes.
Nevertheless, I think this only really matters for things that change a lot like dynamically-allocated router WAN interfaces. I doubt that it would be a concern in the situation we're talking about.
Still, it has always seemed to me that the error is coming from the target host rather than being a DNS problem, per se. It was simply other people saying "use 8.8.8.8" and me thinking "I'm already using 8.8.8.8" that made me wonder if it was somehow relevant.
Querying the A record for the service returns four addresses:
$ dig +short dl-cdn.alpinelinux.org
dualstack.d.sni.global.fastly.net.
151.101.194.133
151.101.66.133
151.101.2.133
151.101.130.133
If I try iterating those IP addresses directly with wget then the other end complains with "500 Domain Not Found", which I take to mean there's a reverse proxy in the middle matching on some/all of "dl-cdn.alpinelinux.org".
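For the record, the iteration was nothing fancier than something along these lines (the path is just the one apk was trying to fetch):
$ for IP in 151.101.2.133 151.101.66.133 151.101.130.133 151.101.194.133 ; do wget -O /dev/null "http://$IP/alpine/v3.14/main/armhf/APKINDEX.tar.gz" ; done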
I mostly do these drill-downs into problems other people are having because I always wind up learning a lot. But there comes a point where I say "can of worms", shrug my shoulders and move right on by. I think I've reached that point with this.
I tried disabling it with your instructions. I'm not sure if that really disabled IPv6
Hmmm...
A test on my running system. First, show that the mechanism I use for disabling IPv6 is in place:
$ grep "^net.ipv6" /etc/sysctl.conf
net.ipv6.conf.all.disable_ipv6=1
net.ipv6.conf.default.disable_ipv6=1
net.ipv6.conf.lo.disable_ipv6=1
Second, show that I get the same result as you do on your test:
$ cat /sys/module/ipv6/parameters/disable
0
Third, ask the Ethernet and WiFi interfaces what they think. Note the presence of inet in the second line of output from each interface (IPv4) but the absence of any line starting with inet6.
$ ifconfig eth0
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.203.102 netmask 255.255.255.0 broadcast 192.168.203.255
ether dc:a6:32:4a:89:f9 txqueuelen 1000 (Ethernet)
RX packets 3615775 bytes 1184361993 (1.1 GiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 1722556 bytes 505948055 (482.5 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
$ ifconfig wlan0
wlan0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.203.103 netmask 255.255.255.0 broadcast 192.168.203.255
ether dc:a6:32:4a:89:fa txqueuelen 1000 (Ethernet)
RX packets 553810 bytes 94741369 (90.3 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 942 bytes 476679 (465.5 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
Now, remove the relevant lines from /etc/sysctl.conf, reboot, and run the same tests:
$ grep "^net.ipv6" /etc/sysctl.conf
$ cat /sys/module/ipv6/parameters/disable
0
$ ifconfig eth0
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.203.102 netmask 255.255.255.0 broadcast 192.168.203.255
inet6 fe80::c4a4:4313:20d:aa6d prefixlen 64 scopeid 0x20<link>
ether dc:a6:32:4a:89:f9 txqueuelen 1000 (Ethernet)
RX packets 1852 bytes 558398 (545.3 KiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 1513 bytes 178673 (174.4 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
$ ifconfig wlan0
wlan0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.203.103 netmask 255.255.255.0 broadcast 192.168.203.255
inet6 fe80::e7c:3f51:3e1b:3c29 prefixlen 64 scopeid 0x20<link>
ether dc:a6:32:4a:89:fa txqueuelen 1000 (Ethernet)
RX packets 416 bytes 73691 (71.9 KiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 206 bytes 37726 (36.8 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
With the patch removed, the interfaces have gained inet6 addresses, showing IPv6 is configured. I honestly don't know what /sys/module/ipv6/parameters/disable is meant to show, or the circumstances when it would be expected to return a value other than zero, but it doesn't appear to reflect the actual state of the interfaces so, if it were me, I wouldn't be relying on it.
there's a noticeable lag in my connection to the Pi when I do this
I don't understand why IPv4 should be any slower than IPv6.
A test:
mac $ for N in 1 2 3 4 5 ; do for H in iot-hub sec-dev new-dev octopi ; do echo -n "$(date) $H " ; ssh $H date ; done ; done
Mon Oct 4 11:34:58 AEDT 2021 iot-hub Mon 04 Oct 2021 11:34:58 AM AEDT
Mon Oct 4 11:34:58 AEDT 2021 sec-dev Mon 04 Oct 2021 11:34:58 AM AEDT
Mon Oct 4 11:34:58 AEDT 2021 new-dev Mon 04 Oct 2021 11:34:59 AM AEDT
Mon Oct 4 11:34:59 AEDT 2021 octopi Mon 04 Oct 2021 11:35:00 AM AEDT
Mon Oct 4 11:35:00 AEDT 2021 iot-hub Mon 04 Oct 2021 11:35:00 AM AEDT
Mon Oct 4 11:35:00 AEDT 2021 sec-dev Mon 04 Oct 2021 11:35:00 AM AEDT
Mon Oct 4 11:35:00 AEDT 2021 new-dev Mon 04 Oct 2021 11:35:00 AM AEDT
Mon Oct 4 11:35:00 AEDT 2021 octopi Mon 04 Oct 2021 11:35:01 AM AEDT
Mon Oct 4 11:35:01 AEDT 2021 iot-hub Mon 04 Oct 2021 11:35:01 AM AEDT
Mon Oct 4 11:35:01 AEDT 2021 sec-dev Mon 04 Oct 2021 11:35:01 AM AEDT
Mon Oct 4 11:35:01 AEDT 2021 new-dev Mon 04 Oct 2021 11:35:02 AM AEDT
Mon Oct 4 11:35:02 AEDT 2021 octopi Mon 04 Oct 2021 11:35:02 AM AEDT
Mon Oct 4 11:35:02 AEDT 2021 iot-hub Mon 04 Oct 2021 11:35:02 AM AEDT
Mon Oct 4 11:35:02 AEDT 2021 sec-dev Mon 04 Oct 2021 11:35:03 AM AEDT
Mon Oct 4 11:35:03 AEDT 2021 new-dev Mon 04 Oct 2021 11:35:03 AM AEDT
Mon Oct 4 11:35:03 AEDT 2021 octopi Mon 04 Oct 2021 11:35:03 AM AEDT
Mon Oct 4 11:35:03 AEDT 2021 iot-hub Mon 04 Oct 2021 11:35:04 AM AEDT
Mon Oct 4 11:35:04 AEDT 2021 sec-dev Mon 04 Oct 2021 11:35:04 AM AEDT
Mon Oct 4 11:35:04 AEDT 2021 new-dev Mon 04 Oct 2021 11:35:04 AM AEDT
Mon Oct 4 11:35:04 AEDT 2021 octopi Mon 04 Oct 2021 11:35:05 AM AEDT
mac $
In short: five iterations of probing a bunch of Raspberry Pis in sequence.
The host running the test is a Mac.
The hosts "iot-hub", "sec-dev", "new-dev" and "octopi" are Raspberry Pis with IPv6 disabled:
The comms path to the Raspberry Pi 4s is Mac Ethernet to gigabit Ethernet switch to Pi. For the 3B+, there are two extra hops via a WiFi access point.
Running ssh with a command argument does the equivalent of connect, login, run the command, logout, disconnect. Each target host name (eg "iot-hub") matches an entry in ~/.ssh/config which maps like this:
host iot-hub
hostname iot-hub.my.domain.com
Thus ssh iot-hub will imply a trip through the resolver but I think we can safely assume that all those DNS names were cached in the local resolver and didn't result in DNS queries going to my local DNS server.
All devices (Mac and Pis) are synching with the same NTP server so they should have the same time to within a millisecond or so.
The "date" command on the Pis does have the ability to show finer resolution than one second (
%N
) but macOS lacks that ability for some reason known only to Apple. Rather than try to find a way around that deficiency, I just went with what I had.
Making all due allowances for when each command is issued within the span of the "current second", I read the results as saying that the Pi4s are capable of a round-trip response in under ⅓ of a second (the 11:35:00 and 11:35:01 batches in particular) while the 3B+ does seem to take a little bit longer. Whether that's down to WiFi, the slower 3B+ or the SD card is unknown.
I don't call the implied wet-finger estimate of "about ½ of ⅓ of a second" to just do a routine "connect and login" particularly slow but, then, I don't have your experience with IPv6 so maybe it really is considerably more snappy.
Conversely, if disabling IPv6 results in you going from sub-second to significantly longer times then the test results above will show that IPv4 really can hold its own and that you might want to look for another cause.
However, if you do notice it getting fixed, I'd appreciate it if you'd add a comment to this post, please.
Sure, and at least I can repeat the problem, so I if you come up with a solution, I can try it.
I think the issue with 8.8.8.8 vs ISP-supplied is the theory that ISPs have been known to intercept DNS traffic so they can try to horn-in on Google's act of selling and supplying tailored ads. Some of the reading I've done (eg The Wire and Ars Technica) suggests that those interception systems can get in the way.
Thanks. I can now understand the logic behind this.
Querying the A record for the service returns four addresses:
$ dig +short dl-cdn.alpinelinux.org
dualstack.d.sni.global.fastly.net.
151.101.194.133
151.101.66.133
151.101.2.133
151.101.130.133
If I try iterating those IP addresses directly with wget then the other end complains with "500 Domain Not Found", which I take to mean there's a reverse proxy in the middle matching on some/all of "dl-cdn.alpinelinux.org".
Indeed sounds like the server is behind a reverse proxy. However, oddly enough I see only one A record, and it's none of those that you listed:
$ dig dl-cdn.alpinelinux.org
; <<>> DiG 9.16.1-Ubuntu <<>> dl-cdn.alpinelinux.org
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 40882
;; flags: qr rd ad; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available
;; QUESTION SECTION:
;dl-cdn.alpinelinux.org. IN A
;; ANSWER SECTION:
dl-cdn.alpinelinux.org. 0 IN CNAME dualstack.d.sni.global.fastly.net.
dualstack.d.sni.global.fastly.net. 0 IN A 151.101.86.133
;; Query time: 0 msec
;; SERVER: 172.30.224.1#53(172.30.224.1)
;; WHEN: Mon Oct 04 20:45:49 EEST 2021
;; MSG SIZE rcvd: 158
Third, ask the Ethernet and WiFi interfaces what they think. Note the presence of inet in the second line of output from each interface (IPv4) but the absence of any line starting with inet6.
I verified with ifconfig that I'm able to disable IPv6. This sounded like a reasonable explanation, but disabling IPv6 didn't help.
I don't understand why IPv4 should be any slower than IPv6.
It's definitely not that IPv4 is so slow. I think I broke something, since I was seeing random delays of up to a few seconds after every key press. Anyway, since disabling IPv6 didn't help anything, I'll just keep it enabled.
Another idea that I had is that I could debug this by running commands in the Dockerfile. There's no dig in the Docker image, so I tried nslookup. I got this error:
nslookup: clock_gettime(MONOTONIC) failed
If I run wget http://dl-cdn.alpinelinux.org/alpine/v3.14/community/armhf/APKINDEX.tar.gz, I get:
wget: bad address 'dl-cdn.alpinelinux.org'
Actually it seems that I'm not able to access anything with wget from the build script.
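To be clear about what I mean by running commands in the Dockerfile, I just added temporary throw-away steps of roughly this shape (the || true keeps the build going past the failures so the output is visible):
RUN nslookup dl-cdn.alpinelinux.org || true
RUN wget -O /dev/null http://dl-cdn.alpinelinux.org/alpine/v3.14/community/armhf/APKINDEX.tar.gz || true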
The first error message led me to a couple of threads:
https://github.com/haugene/docker-transmission-openvpn/issues/1726 https://github.com/haugene/docker-transmission-openvpn/issues/1564
There, people also suggest setting the DNS server manually. I tried several things to no avail, including setting the DNS server in resolv.conf and various settings in docker-compose.yml:
dns:
- 8.8.8.8
cap_add:
- NET_ADMIN
sysctls:
- net.ipv6.conf.all.disable_ipv6=0
I think I'll get too confused if I try to provide comments and suggestions via both your IOTstack and Alpine posts so I'll stick with IOTstack.
Also, I've been travelling today so half of this has been written on the road and is a bit scatter-brained as a result. Sorry.
There are two lines in your dig output that trouble me:
$ dig dl-cdn.alpinelinux.org
…
;; WARNING: recursion requested but not available
…
;; SERVER: 172.30.224.1#53(172.30.224.1)
I'll deal with the SERVER line first. I don't know your network setup so I might be wrong about this but seeing 172.30.224.1 makes me itch. It suggests (a long way short of proof) one of two things:
Assuming this does indicate queries being directed to a containerised DNS, I'll talk about it.
I don't do that myself. I know opinions vary but I have really deep misgivings about wisdom of a Pi (or any computer) nominating itself for DNS resolution when the DNS is running in a container.
Experience has taught me that that last sentence is often misinterpreted so I'll spell it out. I'm talking about a very specific combination of elements:
/etc/resolv.conf for that host is using its own IP address (or the loopback address, or some other "get me to the container" mechanism) for DNS.
A lot of wacky stuff goes on with the DNS in container-space and people who behave as though the DNS is not a special case seem somewhat over-represented in Discord questions and GitHub issues.
Assuming I've guessed correctly about what 172.30.224.1 represents, I don't know whether we're talking PiHole, AdGuardHome or BIND running in a container. For the moment, I'm going to assume it is PiHole but the same comments apply to any ad-blocker.
I look at it like this. Does the Pi itself need ad-blocking services?
Unless you're also trying to use the Pi as a general-purpose computer, I reckon the answer is usually "no". Ads only bother human eyeballs. You typically want ad-blocking on the devices where you routinely fire up browsers. It's those other devices that should point to the Pi for their DNS, and then Docker can NAT-forward the traffic to the ad-blocker. That works and works well.
There is also no harm in other devices using something like PiHole for general DNS services such as being authoritative for a local domain.
But – if you want the Pi to use itself for any DNS services (ad-blocking and/or general DNS), then it really shouldn't run in a container. It really should be a native install. That guarantees that (a) the service comes up early (a Docker container arrives pretty late in the scheme of things), (b) there's no NAT nonsense getting to/from container-space, (c) no backdoor routes between containers to worry about, and (d) no Docker hocus-pocus with 127.0.0.11.
Just one person's opinion, of course.
</soapbox>
Moving on.
The WARNING line message looks like trouble. There's an explanation that might help you understand the context. In general, the first DNS server your query reaches should always support recursion. Servers further away need not support recursion but might. I've always taken the view that if the reason a server isn't supporting recursive queries isn't obvious then it's probably a mistake.
I can't explain why you are getting that warning. I can't explain what it actually means in your environment. I can't say for certain if it is the root cause of your problem. All I can say is that that message, and the fact that you only get one A record back, makes me itch.
For the sake of comparison, though, here's the result of me running the same query, three times, on my Raspberry Pi which is running PiHole in a Docker container. The first query is directed at the Pi's own IP address (where PiHole answers), the second at the default gateway of Docker's internal bridged network, and the third is undirected, so it follows /etc/resolv.conf, meaning that the query will be answered by my local upstream DNS (BIND running on macOS). BIND will forward it to 8.8.8.8 and friends. Again, that's what I've told BIND to do with any queries it can't answer.
$ dig @$(hostname -I | awk '{print $1;}') dl-cdn.alpinelinux.org
; <<>> DiG 9.11.5-P4-5.1+deb10u5-Raspbian <<>> @192.168.203.60 dl-cdn.alpinelinux.org
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 52741
;; flags: qr rd ra; QUERY: 1, ANSWER: 5, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;dl-cdn.alpinelinux.org. IN A
;; ANSWER SECTION:
dl-cdn.alpinelinux.org. 3600 IN CNAME dualstack.d.sni.global.fastly.net.
dualstack.d.sni.global.fastly.net. 30 IN A 151.101.2.133
dualstack.d.sni.global.fastly.net. 30 IN A 151.101.66.133
dualstack.d.sni.global.fastly.net. 30 IN A 151.101.130.133
dualstack.d.sni.global.fastly.net. 30 IN A 151.101.194.133
;; Query time: 645 msec
;; SERVER: 192.168.203.60#53(192.168.203.60)
;; WHEN: Wed Oct 06 16:30:47 AEDT 2021
;; MSG SIZE rcvd: 162
$ dig @$(eval echo $(docker network inspect iotstack_default | jq .[0].IPAM.Config[0].Gateway)) dl-cdn.alpinelinux.org
; <<>> DiG 9.11.5-P4-5.1+deb10u5-Raspbian <<>> @172.18.0.1 dl-cdn.alpinelinux.org
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 31017
;; flags: qr rd ra; QUERY: 1, ANSWER: 5, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;dl-cdn.alpinelinux.org. IN A
;; ANSWER SECTION:
dl-cdn.alpinelinux.org. 3600 IN CNAME dualstack.d.sni.global.fastly.net.
dualstack.d.sni.global.fastly.net. 30 IN A 151.101.2.133
dualstack.d.sni.global.fastly.net. 30 IN A 151.101.66.133
dualstack.d.sni.global.fastly.net. 30 IN A 151.101.130.133
dualstack.d.sni.global.fastly.net. 30 IN A 151.101.194.133
;; Query time: 755 msec
;; SERVER: 172.18.0.1#53(172.18.0.1)
;; WHEN: Thu Oct 07 00:07:00 AEDT 2021
;; MSG SIZE rcvd: 162
$ dig dl-cdn.alpinelinux.org
; <<>> DiG 9.11.5-P4-5.1+deb10u5-Raspbian <<>> dl-cdn.alpinelinux.org
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 25774
;; flags: qr rd ra; QUERY: 1, ANSWER: 5, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; COOKIE: a07567ab8eb7d6e001000000615d34b31fb15f7847ccce56 (good)
;; QUESTION SECTION:
;dl-cdn.alpinelinux.org. IN A
;; ANSWER SECTION:
dl-cdn.alpinelinux.org. 2561 IN CNAME dualstack.d.sni.global.fastly.net.
dualstack.d.sni.global.fastly.net. 30 IN A 151.101.194.133
dualstack.d.sni.global.fastly.net. 30 IN A 151.101.130.133
dualstack.d.sni.global.fastly.net. 30 IN A 151.101.2.133
dualstack.d.sni.global.fastly.net. 30 IN A 151.101.66.133
;; Query time: 98 msec
;; SERVER: 192.168.203.65#53(192.168.203.65)
;; WHEN: Wed Oct 06 16:31:31 AEDT 2021
;; MSG SIZE rcvd: 190
The SERVER line is the only difference: 192.168.203.60 (the Pi running PiHole), 172.18.0.1 (the router on the internal bridged network) and 192.168.203.65 (the Mac running BIND). No warnings, and I get four A records back each time.
You, on the other hand, are getting a warning and one A record...
I repeated your experiment adding RUN nslookup www.google.com to the Dockerfile, mainly to show you that I don't think it does what I suspect you think it does.
A true Docker guru (ie not me) would probably use different language to explain this but, as far as I can tell, when you are building a container with a Dockerfile, I think it is closer to a sandbox or "host mode" than actually having a full operating system running in a container. The network services you are getting are those of the Pi, not those of the container.
Notice the "192.168.203.65" 8 lines from the top. That's my local upstream DNS so the path being followed is that of /etc/resolv.conf:
$ BUILD mosquitto
Building mosquitto
Sending build context to Docker daemon 12.29kB
Step 1/14 : FROM eclipse-mosquitto:latest
---> 24a85c54a50e
Step 2/14 : RUN nslookup www.google.com
---> Running in 393ed80929f1
Server: 192.168.203.65
Address: 192.168.203.65:53
Non-authoritative answer:
Name: www.google.com
Address: 142.250.71.68
Non-authoritative answer:
Name: www.google.com
Address: 2404:6800:4006:812::2004
…[snip]…
If I repeat the command inside the running container, the server is the special 127.0.0.11 which is Docker's internal DNS handler:
$ docker exec mosquitto nslookup www.google.com
Server: 127.0.0.11
Address: 127.0.0.11:53
Non-authoritative answer:
Name: www.google.com
Address: 172.217.167.100
Non-authoritative answer:
Name: www.google.com
Address: 2404:6800:4006:812::2004
Ultimately, queries to 127.0.0.11 are going to follow the Pi's /etc/resolv.conf and wind up at 192.168.203.65 (and be forwarded to 8.8.8.8) so it's still the same thing but they appear to be answered by 127.0.0.11.
I think of it like this. If 127.0.0.11 is the apparent server for undirected queries then it implies container services are available. If it's something else then container services aren't available. Logically, the only services that are available are those of the Pi.
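A quick way to see which situation applies (assuming the container is called mosquitto, as mine is) is to look at the container's resolver configuration directly:
$ docker exec mosquitto cat /etc/resolv.conf
# a "nameserver 127.0.0.11" line here means Docker's embedded DNS is answering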
This line in your output, however, is another thing that makes me feel quite itchy:
nslookup: clock_gettime(MONOTONIC) failed
These days, any mention of "time" or "clock", particularly if Alpine is involved, sends me straight to Issue 401. Have you installed both recommended system patches? If not, the libseccomp patch might be what you are looking for.
I was also asking myself what else might be different between our systems that might explain what's going on, and another thought occurred to me. A couple of references:
Getting rid of the obsolete version of docker-compose installed by the IOTstack menu or automatic script (both of which use apt) in favour of a later version installed by pip, which is actually being maintained, might help. If you've done that already, great.
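For reference, the switch-over amounts to something like this (assuming python3-pip is already installed; if not, install it with apt first):
$ sudo apt purge -y docker-compose
$ sudo pip3 install -U docker-compose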
The other thing I've done (so long ago that I'd almost forgotten about it) is to add this line to my .profile:
export COMPOSE_DOCKER_CLI_BUILD=1
See buildkit support for more info.
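For completeness, enabling BuildKit proper usually involves a second variable alongside the first. I only run the first line myself, so treat the second as an untested suggestion:
export COMPOSE_DOCKER_CLI_BUILD=1
export DOCKER_BUILDKIT=1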
Again, I have no idea whether this will help or make no difference. I'm just canvassing possible explanations for why Mosquitto builds reliably for me but not for you.
The first "docker-transmission" URL triggers my same "this might be libseccomp2
" reflex.
For the record, the relevant lines in my /etc/resolvconf.conf look like this:
name_servers=192.168.132.65
search_domains=my.domain.com
and, accordingly, my /etc/resolv.conf like this:
# Generated by resolvconf
search my.domain.com
nameserver 192.168.132.65
In other words, send all queries to BIND running on the Mac.
My PiHole service definition is:
pihole:
container_name: pihole
image: pihole/pihole:latest
restart: unless-stopped
environment:
# https://github.com/pi-hole/docker-pi-hole#environment-variables
- TZ=Australia/Sydney
- WEBPASSWORD=wouldntuliketoknow
- INTERFACE=eth0
- REV_SERVER=true
- REV_SERVER_DOMAIN=my.domain.com
- REV_SERVER_TARGET=192.168.132.65
- REV_SERVER_CIDR=192.168.132.0/24
ports:
- "53:53/tcp"
- "53:53/udp"
- "67:67/udp"
- "8089:80/tcp"
volumes:
- ./volumes/pihole/etc-pihole/:/etc/pihole/
- ./volumes/pihole/etc-dnsmasq.d/:/etc/dnsmasq.d/
dns:
- 127.0.0.1
- 1.1.1.1
cap_add:
- NET_ADMIN
The "REV_SERVER" vars mean "if it's a name-to-address query where name is in my.domain.com or it's an address-to-name query in the 192.168.132.0/24 subnet, forward the query to 192.168.132.65".
I have to admit that I've never really thought about the dns: directives. The PiHole GUI is set to round-robin 8.8.8.8, 8.8.4.4 and 1.1.1.1. I watched traffic for a while so I know it's doing that.
In context, that 127.0.0.1 should mean "this container". But why it's there ... dunno.
After implementing the Interim Fix, building Mosquitto fails at step 3/10 instead of step 2/10, just because it delays the fetch from those two URLs. Errors for both cases pasted below. Opening the links (https or http) in a browser on my PC downloads the .tar.gz file, and I'm able to browse around the index.
I will go ahead with the two system patches now. Then read through the exchange with senarvi, try some things, and report back if I discover something useful. Let me know how I can help.
Before adding the two lines to the Dockerfile:
pi@raspberrypi:~/IOTstack $ docker-compose build --no-cache --pull mosquitto
WARNING: Some networks were defined but are not used by any service: iotstack_nw_internal, nextcloud_internal
Building mosquitto
Step 1/9 : FROM eclipse-mosquitto:latest
latest: Pulling from library/eclipse-mosquitto
Digest: sha256:ce08d3fe69d4170cea2426739af86ac95e683f01dd2c4141da661983a2401364
Status: Image is up to date for eclipse-mosquitto:latest
---> 24a85c54a50e
Step 2/9 : RUN apk update && apk add --no-cache rsync tzdata
---> Running in 7b6df54c3d8d
fetch https://dl-cdn.alpinelinux.org/alpine/v3.14/main/armhf/APKINDEX.tar.gz
fetch https://dl-cdn.alpinelinux.org/alpine/v3.14/community/armhf/APKINDEX.tar.gz
ERROR: https://dl-cdn.alpinelinux.org/alpine/v3.14/main: temporary error (try again later)
WARNING: Ignoring https://dl-cdn.alpinelinux.org/alpine/v3.14/main: No such file or directory
ERROR: https://dl-cdn.alpinelinux.org/alpine/v3.14/community: temporary error (try again later)
WARNING: Ignoring https://dl-cdn.alpinelinux.org/alpine/v3.14/community: No such file or directory
2 errors; 20 distinct packages available
ERROR: Service 'mosquitto' failed to build: The command '/bin/sh -c apk update && apk add --no-cache rsync tzdata' returned a non-zero code: 2
After:
pi@raspberrypi:~/IOTstack $ docker-compose build --no-cache --pull mosquitto
WARNING: Some networks were defined but are not used by any service: nextcloud_internal, iotstack_nw_internal
Building mosquitto
Step 1/10 : FROM eclipse-mosquitto:latest
latest: Pulling from library/eclipse-mosquitto
Digest: sha256:ce08d3fe69d4170cea2426739af86ac95e683f01dd2c4141da661983a2401364
Status: Image is up to date for eclipse-mosquitto:latest
---> 24a85c54a50e
Step 2/10 : RUN sed -i 's/https/http/' /etc/apk/repositories
---> Running in cbec807a1ae6
Removing intermediate container cbec807a1ae6
---> 780f71221244
Step 3/10 : RUN apk update && apk add --no-cache rsync tzdata
---> Running in a9f4f56d4c68
fetch http://dl-cdn.alpinelinux.org/alpine/v3.14/main/armhf/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.14/community/armhf/APKINDEX.tar.gz
ERROR: http://dl-cdn.alpinelinux.org/alpine/v3.14/main: temporary error (try again later)
WARNING: Ignoring http://dl-cdn.alpinelinux.org/alpine/v3.14/main: No such file or directory
2 errors; 20 distinct packages available
ERROR: http://dl-cdn.alpinelinux.org/alpine/v3.14/community: temporary error (try again later)
WARNING: Ignoring http://dl-cdn.alpinelinux.org/alpine/v3.14/community: No such file or directory
ERROR: Service 'mosquitto' failed to build: The command '/bin/sh -c apk update && apk add --no-cache rsync tzdata' returned a non-zero code: 2
@ryan-gore very very interesting. And thanks for that feedback. Let's hope it also solves the problem for @senarvi
It was exactly that! I can't believe I missed this note from the bottom of your first post. Installing libseccomp2 fixed the problem. I didn't even have to replace https with http. I'll go on and check what else you suggested to update.
Brilliant! Meanwhile, over on Discord I posted this:
Are you doing something similar? Does that explain "172.30.224.1" in your output? If yes, how did you figure out that the router/default gateway on the internal network could be used to reach PiHole?
I had actually noticed this pattern before:
$ dig @192.168.132.102 +short testblock.my.domain.com
0.0.0.0
$ docker exec nodered dig @192.168.132.102 +short testblock.my.domain.com
;; reply from unexpected source: 172.18.0.1#53, expected 192.168.132.102#53
but it never occurred to me to try sending DNS requests to the virtual router on the internal bridged network.
Also, I still don't understand that non-recursive server warning. Are you using PiHole? I tried to force PiHole to become non-recursive but I could not figure out how to do it.
@Paraphraser I'm not using PiHole. Actually I see that a DNS server with an address starting with 172 is only used when I run the command in WSL, so I wasn't even running it on the Raspberry Pi. When I run it on the Pi, I don't get the "recursion requested but not available" warning, and it shows the DNS server of my ISP.
Tried everything, then had a light bulb moment and switched from my router wifi to a phone hotspot (using cellular data) and it worked like a charm.
Well, just one comment...
I had the error described in this Issue. I followed the suggestions here without luck, but then I figured out I could keep installing everything but Mosquitto, and that worked fine.
When everything else was installed I followed the next steps on the guide: https://sensorsiot.github.io/IOTstack/Getting-Started/
and found out about this patch:
step 2: if you are running "buster" …
You need this patch if you are running Raspbian Buster. Without this patch, Docker images will fail if:
the image is based on Alpine and the image's maintainer updates to Alpine 3.13; and/or
an image's maintainer updates to a library that depends on 64-bit values for Unix epoch time (the so-called Y2038 problem).
To install the patch:
$ sudo apt-key adv --keyserver hkps://keyserver.ubuntu.com:443 --recv-keys 04EE7237B7D453EC 648ACFD622F3D138
$ echo "deb http://httpredir.debian.org/debian buster-backports main contrib non-free" | sudo tee -a "/etc/apt/sources.list.d/debian-backports.list"
$ sudo apt update
$ sudo apt install libseccomp2 -t buster-backports
I patched it, ran menu.sh again, selected mosquitto, and all worked fine....
so.... as a suggestion, maybe this patch should be higher up in the guide, before the installation part, something like "warning, before you start, check if you are running Buster and if so..."
regards!
I had the same issue; I updated libseccomp2 per the "Getting started" instructions without any luck. After also updating Docker to v20+ per https://blog.samcater.com/fix-workaround-rpi4-docker-libseccomp2-docker-20/ it installs fine.
Problem
If you are reading this, it is probably because you have just tried to build Mosquitto and it has failed with an error pattern like this:
This seems to be a known issue. It is not an IOTstack problem and neither is it a Mosquitto problem. It's inherited from Alpine Linux. It was first reported in July 2020. See Docker Alpine issue 98.
It is not clear why it has suddenly started causing problems for IOTstack Mosquitto container builds. Neither is it clear why it only affects some IOTstack users.
Solution
The fix is to add these lines to Mosquitto's Dockerfile:
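The exact lines aren't reproduced here, but judging from the build output earlier in this thread the substantive change is a step that rewrites the APK repository URLs from https to http, along the lines of (the comment line is mine):
# work around https fetch failures - see Docker Alpine issue 98
RUN sed -i 's/https/http/' /etc/apk/repositories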
The completed Dockerfile should look like this:
Pull Requests
I have filed Pull Requests to implement this fix:
Interim fix
Edit the following file (sudo is not required):
Add the two lines as above.
Rebuild Mosquitto:
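The rebuild command used elsewhere in this thread is shown below; the final up -d step is my assumption about how you bring the rebuilt container back into service:
$ cd ~/IOTstack
$ docker-compose build --no-cache --pull mosquitto
$ docker-compose up -d mosquitto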
Later, when the Pull Requests are applied, you will need to undo your local changes before Git will let you update against GitHub:
One more thought…
While you are fixing things, make sure you have implemented both recommended system patches.
The second patch (libseccomp2) is directly relevant to Mosquitto. See Issue 401 for more information.