Closed hannesrd closed 5 months ago
Is this issue intermittent, or can you not connect to pmc-geofence.trafficmanager.net at all? Could you paste the output of `curl -v https://pmc-geofence.trafficmanager.net/`?
@hannesrd note that mssql packages are delivered via geofence (which is required for legal/tax reasons). If you have network restrictions in place, they may block your access to the geofence infrastructure. The `curl -v` output would likely clarify that.
[ 4/21] RUN curl -v https://pmc-geofence.trafficmanager.net/
0.073 % Total % Received % Xferd Average Speed Time Time Time Current
0.073 Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
Trying 137.117.241.158:443...
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
Connected to pmc-geofence.trafficmanager.net (137.117.241.158) port 443 (#0)
0.135 ALPN: offers h2,http/1.1
0.135 } [5 bytes data]
0.135 TLSv1.3 (OUT), TLS handshake, Client hello (1):
0.135 } [512 bytes data]
0.161 CAfile: /etc/ssl/certs/ca-certificates.crt
0.161 CApath: /etc/ssl/certs
0 0 0 0 0 0 0 0 --:--:-- 0:00:01 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:00:02 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:00:03 --:--:-- 0
.....
0 0 0 0 0 0 0 0 --:--:-- 0:02:00 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:01 --:--:-- 0
Recv failure: Connection reset by peer
122.1 OpenSSL SSL_connect: Connection reset by peer in connection to pmc-geofence.trafficmanager.net:443
0 0 0 0 0 0 0 0 --:--:-- 0:02:02 --:--:-- 0
122.1 Closing connection 0
122.1 curl: (35) Recv failure: Connection reset by peer
ERROR: process "/bin/sh -c curl -v https://pmc-geofence.trafficmanager.net/" did not complete successfully: exit code: 35
I don't get why it is possible from other locations in our network.
We are using gitlab-runner in kubernetes.
@daviddavis @mbearup I forgot to tag you. The information is pasted in my last comment.
@hannesrd I'm at a bit of a loss here.
```
$ curl -H "Host: pmc-geofence.trafficmanager.net" http://137.117.241.158/
alma/
```
- HTTPS works too (have to use /etc/hosts trick to satisfy TLS)
```
$ ping pmc-geofence.trafficmanager.net
PING pmc-geofence.trafficmanager.net (137.117.241.158) 56(84) bytes of data.
64 bytes from pmc-geofence.trafficmanager.net (137.117.241.158): icmp_seq=1 ttl=101 time=152 ms
...
$ curl https://pmc-geofence.trafficmanager.net/
Index of /
Index of /
...
```
- We have a WAF enabled, but it would emit known error codes (403 or 429) if it was blocking your traffic. To my knowledge, there's no scenario where the WAF would drop a connection in this manner.
- Since this is a connection failure, it's unlikely to appear in our access or WAF logs. You could try requesting a unique URL (e.g. /foo), which would help us find your request in the logs, but if the connection isn't successfully established, I suspect nothing will be logged.
- You could try hitting this endpoint via HTTP (i.e. `curl http://pmc-geofence.trafficmanager.net/`). It's possible the failure is centered around the TLS handshake, so using HTTP may reveal more information.
- You could try targeting a different App Gateway (e.g. `curl -v -H "Host: pmc-geofence.trafficmanager.net" http://168.63.54.159/`). I suspect this will fail the same way, but it could rule out any issues with a specific App Gateway.
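As an alternative to the /etc/hosts trick mentioned above, curl's `--resolve` option can pin a hostname to a specific IP for a single request while still sending the real hostname for TLS. The following is a minimal sketch; `pinned_curl_cmd` is a hypothetical helper name, and it only prints the command rather than running it, so it is safe to try anywhere.

```shell
# pinned_curl_cmd HOST IP [PATH]
# Hypothetical helper: print (do not run) a curl command that pins HOST to IP
# on port 443 via --resolve, so TLS still sees the real hostname.
pinned_curl_cmd() {
    host=$1
    ip=$2
    path=${3:-/}
    printf 'curl -v --resolve %s:443:%s https://%s%s\n' "$host" "$ip" "$host" "$path"
}

# Example using the hostname and IP that appear in this thread.
pinned_curl_cmd pmc-geofence.trafficmanager.net 137.117.241.158 /
```

Running the printed command against each App Gateway IP in turn would show whether the failure depends on the gateway or on the client's network path.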
@mbearup thanks for the hints!
Today I ran two tests:
curl -v -H "Host: pmc-geofence.trafficmanager.net" http://168.63.54.159/test1-linux-package-repositories-issues-127 (test1, from the Kubernetes worker node) -> OK
curl -v -H "Host: pmc-geofence.trafficmanager.net" http://168.63.54.159/test2-linux-package-repositories-issues-127 (test2, from the GitLab build container) -> failed
The build container is Docker-in-Docker, and I don't get why this URL failed there. Are there any known issues from other cloud or Docker users? Like:
https://developercommunity.visualstudio.com/t/Packages-for-mssql-tools-report-403/10613043?sort=newest
https://github.com/microsoft/linux-package-repositories/issues/119
https://github.com/microsoft/msphpsql/issues/1505
So I guess the problem is in our container configuration.
curl to other websites, such as GitHub, works.
http
curl http://pmc-geofence.trafficmanager.net/
[ 4/22] RUN curl http://pmc-geofence.trafficmanager.net/
0.073 % Total % Received % Xferd Average Speed Time Time Time Current
0.073 Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:00:01 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:00:02 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:00:03 --:--:-- 0
...
0 0 0 0 0 0 0 0 --:--:-- 0:02:11 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:12 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:13 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:14 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:15 --:--:-- 0
135.2 curl: (56) Recv failure: Connection reset by peer
ERROR: process "/bin/sh -c curl http://pmc-geofence.trafficmanager.net/" did not complete successfully: exit code: 56
importing cache manifest from registry-gitlab.relaxdays.de/team-devops/gtiops/ci-cd-workshop/test-apt:c93305f8f2736caad936ea3b6ca5023b82c84b92:
other host curl -v -H "Host: pmc-geofence.trafficmanager.net" http://168.63.54.159/
[ 4/22] RUN curl -v -H "Host: pmc-geofence.trafficmanager.net" http://168.63.54.159/
0.075 % Total % Received % Xferd Average Speed Time Time Time Current
0.075 Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
Trying 168.63.54.159:80...
0.107 Connected to 168.63.54.159 (168.63.54.159) port 80 (#0)
0.107 > GET / HTTP/1.1
0.107 > Host: pmc-geofence.trafficmanager.net
0.107 > User-Agent: curl/7.88.1
0.107 > Accept: */*
0.107 >
0 0 0 0 0 0 0 0 --:--:-- 0:00:01 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:00:02 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:00:03 --:--:-- 0
...
0 0 0 0 0 0 0 0 --:--:-- 0:02:13 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:02:14 --:--:-- 0
Recv failure: Connection reset by peer
0 0 0 0 0 0 0 0 --:--:-- 0:02:15 --:--:-- 0
135.3 * Closing connection 0
135.3 curl: (56) Recv failure: Connection reset by peer
ERROR: process "/bin/sh -c curl -v -H \"Host: pmc-geofence.trafficmanager.net\" http://168.63.54.159/" did not complete successfully: exit code: 56
importing cache manifest from registry-gitlab.relaxdays.de/team-devops/gtiops/ci-cd-workshop/test-apt:d560b6d5f697025c6936ecd4b7a4f4d6c3ce132f:
@mbearup @daviddavis We ran a tcpdump on the hosting node and are curious about the MTU warnings. Log attached: tcpdump.txt
@hannesrd regarding the other issues you linked above, all of those are HTTP 403 errors. As I mentioned earlier, that's a symptom of the WAF blocking requests. Since you're getting a TCP connection failure, it's a different symptom.
Which is a relevant point: for your test requests, surprisingly, I see both in our logs. However, we emitted 403s for these URLs (test1 and test2), because we have a rule that rejects unknown top-level folders (to filter out garbage requests, e.g. /admin.php). So perhaps this was a poor test case.
We could try again with a different test URL (e.g. /ubuntu/test1). However, this does provide useful information: your client seems to have experienced a connection failure, but the App Gateway did receive and respond to the request. And both the test1 and test2 requests came from the same IP, so I think we can rule out other requestors.
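The distinction drawn here, an HTTP 403 or 429 from the WAF versus a connection-level failure such as the curl exit codes 35 and 56 seen earlier in this thread, can be sketched as a small shell helper. This is only an illustration: `classify_failure` and its labels are hypothetical names, and a 403 can of course have causes other than a WAF rule.

```shell
# classify_failure CURL_EXIT [HTTP_STATUS]
# Hypothetical helper: map a curl exit code and HTTP status to a rough label.
# curl exit codes used: 7 (failed to connect), 28 (timeout),
# 35 (SSL connect error), 56 (failure receiving network data).
classify_failure() {
    curl_exit=$1
    http_status=${2:-000}
    case "$curl_exit" in
        0)
            # The request completed; the HTTP status decides the label.
            case "$http_status" in
                403|429) echo "blocked-by-waf" ;;
                2??|3??) echo "ok" ;;
                *)       echo "http-error" ;;
            esac
            ;;
        7)  echo "connect-failed" ;;
        28) echo "timeout" ;;
        35) echo "tls-handshake-failed" ;;
        56) echo "recv-failure" ;;
        *)  echo "other-curl-error" ;;
    esac
}

# The two failures reported in this thread:
classify_failure 35      # HTTPS attempt
classify_failure 56      # HTTP attempt
```

In practice the inputs could come from something like `status=$(curl -s -o /dev/null -w '%{http_code}' "$url"); classify_failure $? "$status"`, since curl exits 0 on a 403 unless told otherwise.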
Looking at requests for `mssql-tools18_18.2.1.1-1_amd64.deb` from the same IP, I see two:
- One was at 2024-04-09T07:43:10Z and successful (200)
- The other was at 2024-04-09T07:42:46Z and was aborted prematurely by the client (`ERRORINFO_CLIENT_CLOSED_REQUEST`)
Also attaching the output of my test/tcpdump. My VM is set for MTU 1500 (which is fairly standard, in Azure and elsewhere) and I received no MTU fragmentation messages. tcpdump-1500.txt
@hannesrd I can only conclude that this is a local networking issue (perhaps related to docker-in-docker). Per the above, we support an MTU of 1500, and our service is receiving your requests. Something else seems to be interfering with the connection. Apologies that we can't provide more clarity here.
@mbearup thanks for the investigation on your side! This helped us very much. I see the issue is closed, but let me write down our results for anyone who may find them:
MTU was the issue. The docker-in-docker container had an MTU of 1500 while the outer container had 1450. I'm not sure why this was only a problem with this endpoint. I found it by adding "RUN ifconfig" to my Dockerfile. You may also test with different payload sizes, like
RUN ping -M do -s 1401 -c 1 8.8.8.8
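To pick a payload size that actually brackets a given MTU, the arithmetic is: payload = MTU minus the IPv4 header (20 bytes, assuming no IP options) minus the ICMP echo header (8 bytes). A minimal sketch follows; `icmp_payload_for_mtu` is a hypothetical helper name.

```shell
# icmp_payload_for_mtu MTU
# Hypothetical helper: largest ICMP echo payload (the ping -s value) that fits
# in one unfragmented IPv4 packet of the given MTU.
# Assumes a 20-byte IPv4 header (no options) and an 8-byte ICMP header.
icmp_payload_for_mtu() {
    mtu=$1
    echo $(( mtu - 20 - 8 ))
}

icmp_payload_for_mtu 1450   # prints 1422
icmp_payload_for_mtu 1500   # prints 1472
```

With the MTUs from this thread, `ping -M do -s 1422` should still pass on a 1450-byte path, while `-s 1423` should be rejected because fragmentation is prohibited by `-M do`. The 1401-byte payload above sits below that boundary.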
I tried several fixes by configuring the MTU in the gitlab-runner TOML config. That didn't change anything.
The solution was to set it in the gitlab-ci.yaml.
build:
  services:
    - name: docker:20.10.12-dind
      command:
        - "--mtu=1450"
Source Bug: https://www.civo.com/learn/fixing-networking-for-docker
Source Solution: https://gitlab.com/gitlab-org/gitlab/-/issues/27716#note_628181430
So: Fixed for me. We will include this in our template.
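A check along these lines could guard against the same mismatch in a CI template. It is a sketch under assumptions: `list_mtus` and `mtu_exceeds` are hypothetical helper names, and the interface MTUs are read from the Linux sysfs path `/sys/class/net/<iface>/mtu` instead of parsing `ifconfig` output.

```shell
# list_mtus [SYSFS_NET_DIR]
# Hypothetical helper: print "<interface> <mtu>" for every interface found
# under the given directory (default: /sys/class/net on Linux).
list_mtus() {
    dir=${1:-/sys/class/net}
    for iface in "$dir"/*; do
        # Skip entries without a readable mtu file (also covers an empty glob).
        [ -r "$iface/mtu" ] || continue
        printf '%s %s\n' "$(basename "$iface")" "$(cat "$iface/mtu")"
    done
}

# mtu_exceeds INNER OUTER
# Succeeds (exit 0) when the inner MTU is strictly larger than the outer one.
mtu_exceeds() {
    [ "$1" -gt "$2" ]
}

# The values reported in this thread: inner (dind) 1500, outer 1450.
if mtu_exceeds 1500 1450; then
    echo "inner MTU 1500 exceeds outer MTU 1450: large packets may be dropped"
fi
```

Running `list_mtus` in both the outer job container and the docker-in-docker build and comparing the numbers would surface the mismatch before a package download stalls.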
Describe the issue
We are trying to install mssql-tools18 from different locations. The install fails inside a Docker build.
When did the issue occur?
Using Kubernetes Gitlab Runner.
If applicable, what package did you attempt to install, and from which repo?
mssql-tools18
deb [arch=amd64,armhf,arm64] https://packages.microsoft.com/debian/11/prod bullseye main
Steps to Reproduce
Dockerfile containing
RUN curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
RUN curl https://packages.microsoft.com/config/debian/11/prod.list | tee /etc/apt/sources.list.d/msprod.list
RUN apt update
RUN ACCEPT_EULA=Y apt -y install mssql-tools18
RUN ACCEPT_EULA=Y apt -y install unixodbc-dev
RUN ACCEPT_EULA=Y apt -y install msodbcsql18
for
RUN ACCEPT_EULA=Y apt -y install mssql-tools18
we get something like
for
RUN wget https://pmc-geofence.trafficmanager.net/
we get timeouts like
Actual Result
Fail like above
unixodbc-dev and msodbcsql18 are installed
Expected Result
Installation of package mssql-tools18
When I use another system I get a different IP for pmc-geofence.trafficmanager.net and everything works.