Closed vsoch closed 6 months ago
okay tried this on the VM, restarting everything:
echo net.ipv4.ip_unprivileged_port_start=443 >> /etc/sysctl.conf
echo net.ipv4.ip_unprivileged_port_start=80 >> /etc/sysctl.conf
echo net.ipv4.ip_unprivileged_port_start=8080 >> /etc/sysctl.conf
sysctl -p
systemctl daemon-reload
systemctl restart docker
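For a sanity check that the reload actually took effect, the current threshold can be read straight from /proc without root:

```shell
# Read the effective threshold directly from /proc (no root needed).
# Ports at or above this value can be bound by unprivileged processes.
cat /proc/sys/net/ipv4/ip_unprivileged_port_start
```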
And then tried to recreate the docker compose setup, but I'm still getting this error:
[+] Running 2/2
✔ Network usernetes_default Created 0.1s
✔ Container usernetes-node-1 Created 0.1s
Error response from daemon: driver failed programming external connectivity on endpoint usernetes-node-1 (8e49fdcac74805e5a05c53aea638bfde6e1abeac6e59f673c3e91787000425ad): Error starting userland proxy: error while calling PortManager.AddPort(): cannot expose privileged port 6443, you can add 'net.ipv4.ip_unprivileged_port_start=6443' to /etc/sysctl.conf (currently 8080), or set CAP_NET_BIND_SERVICE on rootlesskit binary, or choose a larger port number (>= 8080): listen tcp4 0.0.0.0:6443: bind: permission denied
make: *** [Makefile:64: up] Error 1
That port was previously working, so I likely need to start fresh: add these ports to the initial docker-compose, redo the system setup, and try the creation from scratch.
oh I see the issue - that parameter sets where the unprivileged port range *starts*, so my last line (8080) snuffed out the lower values! Going to try again and dangerously set it to 0 (don't worry, this cluster is extremely isolated).
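To make that override behavior concrete (a standalone illustration against a temp file, not the real /etc/sysctl.conf): entries are applied top to bottom, so for a repeated key only the last assignment survives.

```shell
# Simulate the three appended lines; the last assignment wins.
cat > /tmp/sysctl-demo.conf <<'EOF'
net.ipv4.ip_unprivileged_port_start=443
net.ipv4.ip_unprivileged_port_start=80
net.ipv4.ip_unprivileged_port_start=8080
EOF
# The effective value is whatever was assigned last:
awk -F= '$1 == "net.ipv4.ip_unprivileged_port_start" {v=$2} END {print v}' /tmp/sysctl-demo.conf
# prints 8080
```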
okay, update - no issue with the usernetes setup (I've added the ports and started it with no error messages), and now instead of failing to connect I'm just seeing an empty response:
$ curl -v http://localhost/api
* Trying 127.0.0.1:80...
* Connected to localhost (127.0.0.1) port 80 (#0)
> GET /api HTTP/1.1
> Host: localhost
> User-Agent: curl/7.81.0
> Accept: */*
>
* Empty reply from server
* Closing connection 0
curl: (52) Empty reply from server
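For what that error actually means: curl exit code 52 ("Empty reply from server") says the TCP connection was accepted but closed before any HTTP bytes came back, so *something* is listening on port 80, it just isn't answering. A minimal reproduction with a throwaway listener (hypothetical port 9999, not part of this setup):

```shell
# Toy server: accept one connection, read the request, then close
# without sending any HTTP response at all.
python3 - <<'EOF' &
import socket
s = socket.socket()
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.bind(("127.0.0.1", 9999))
s.listen(1)
c, _ = s.accept()
c.recv(4096)   # consume the request so the close is a clean FIN
c.close()
EOF
sleep 1
rc=0
curl -s http://127.0.0.1:9999/ || rc=$?
echo "curl exit code: $rc"   # prints: curl exit code: 52
```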
My service, and ingress:
$ kubectl describe ingress
Name: ml-ingress
Labels: <none>
Namespace: default
Address: localhost
Ingress Class: <none>
Default backend: <default>
Rules:
Host Path Backends
---- ---- --------
localhost
/ ml-service:8080 (10.244.1.2:8080)
Annotations: <none>
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Sync 3m43s (x2 over 3m43s) nginx-ingress-controller Scheduled for sync
$ kubectl describe svc ml-service
Name: ml-service
Namespace: default
Labels: <none>
Annotations: <none>
Selector: run=ml-service
Type: ClusterIP
IP Family Policy: SingleStack
IP Families: IPv4
IP: 10.96.121.83
IPs: 10.96.121.83
Port: <unset> 8080/TCP
TargetPort: 8080/TCP
Endpoints: 10.244.1.2:8080
Session Affinity: None
Events: <none>
I'll keep thinking about what the empty response might mean. If you have ideas let me know what we might try!
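One thing worth ruling out from the `describe ingress` output above: `Ingress Class: <none>`. ingress-nginx normally only reconciles Ingresses that name its class (the Sync event suggests it was picked up here, but it's cheap to be explicit). A hedged sketch of what that would look like, assuming the class installed by the kind deploy manifest is named `nginx`:

```yaml
# Hedged sketch: pin the Ingress to the nginx controller explicitly.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ml-ingress
spec:
  ingressClassName: nginx   # assumption: class name from the ingress-nginx install
  rules:
  - host: localhost
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: ml-service
            port:
              number: 8080
```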
okay, I had another idea, just for debugging! I looked up the node where the pod with our service is running:
$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
ml-server-55d6f7b4c5-s6ffw 1/1 Running 0 96s 10.244.4.2 u7s-u2204-05 <none> <none>
Then I shelled in and tested that I could reach the service via the pod IP (I could):
$ make shell
docker compose exec -e U7S_HOST_IP=192.168.65.125 -e U7S_NODE_NAME=u7s-u2204-05 -e U7S_NODE_SUBNET=10.100.177.0/24 node bash
And then explicitly curled the pod address, and that worked.
# curl -k 10.244.4.2:8080/api/ | jq
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 257 100 257 0 0 125k 0 --:--:-- --:--:-- --:--:-- 250k
{
"id": "django_river_ml",
"status": "running",
"name": "Django River ML Endpoint",
"description": "This service provides an api for models",
"documentationUrl": "https://vsoch.github.io/django-river-ml",
"storage": "shelve",
"river_version": "0.21.0",
"version": "0.0.21"
}
So that tells us everything is running OK in the pod, but there is an issue with how the service is exposed. Exiting from there, I'm now looking at what docker compose is mapping:
$ docker compose ps
WARN[0000] The "U7S_HOST_IP" variable is not set. Defaulting to a blank string.
WARN[0000] The "U7S_NODE_NAME" variable is not set. Defaulting to a blank string.
WARN[0000] The "U7S_NODE_SUBNET" variable is not set. Defaulting to a blank string.
NAME IMAGE COMMAND SERVICE CREATED STATUS PORTS
usernetes-node-1 usernetes-node "/u7s-entrypoint.sh /usr/local/bin/entrypoint /sbin/init" node 32 minutes ago Up 32 minutes 0.0.0.0:80->80/tcp, :::80->80/tcp, 0.0.0.0:443->443/tcp, :::443->443/tcp, 0.0.0.0:2379->2379/tcp, :::2379->2379/tcp, 0.0.0.0:6443->6443/tcp, :::6443->6443/tcp, 0.0.0.0:8080->8080/tcp, :::8080->8080/tcp, 0.0.0.0:10250->10250/tcp, :::10250->10250/tcp, 0.0.0.0:8472->8472/udp, :::8472->8472/udp
We can see it's mapping port 8080, so we should be able to access the service from outside of that container?
$ curl -k localhost:8080/api/
curl: (52) Empty reply from server
That didn't work, nor did any of the variants I tried. So the issue seems to be exposing the service from inside docker-compose to the VM running the container. The port actually does seem to be present (it's not that nothing is listening), but the reply is empty. And just to clarify (because this is a common bug): the server is binding to `0.0.0.0`, not `localhost` or `127.0.0.1`.
@AkihiroSuda my colleague had an insight that gives us (at least for now) a solution: we can hit the hostname running the pod directly! The missing piece was defining the hostPort; here is the diff for the relevant section.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ml-server
spec:
  selector:
    matchLabels:
      run: ml-service
  replicas: 1
  template:
    metadata:
      labels:
        run: ml-service
    spec:
      containers:
      - name: ml-service
        image: ghcr.io/converged-computing/lammps-stream-ml:test-server
        # These should be secrets, but OK to test
+       # EXTREMELY IMPORTANT: we need to set the host port so it's mapped to the same as usernetes
        ports:
        - containerPort: 8080
+         hostPort: 8080
        - containerPort: 80
+         hostPort: 80
And then we can hit that endpoint from any node, explicitly targeting the host and port:
It would be good to eventually figure out a solution so this just works across localhost, but this should work for us for now since we are prototyping the setup for basic experiments. Thanks to @milroy for figuring this out - we are unblocked with this fix! :tada:
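On the "figure out a solution so this just works" note: one candidate (untested here, just a sketch) is a NodePort service, which exposes the service on every node without pinning the pod to a hostPort; the chosen nodePort would still need to be published in the usernetes docker-compose ports list.

```yaml
# Hedged sketch (assumed service name; nodePort 30080 is arbitrary
# within the default 30000-32767 NodePort range).
apiVersion: v1
kind: Service
metadata:
  name: ml-service-nodeport
spec:
  type: NodePort
  selector:
    run: ml-service
  ports:
  - port: 8080
    targetPort: 8080
    nodePort: 30080
```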
Also @AkihiroSuda this work with usernetes is super cool and coming along quite nicely, and we have you to thank for that! I'm going to be sharing a tiny bit of it at FOSDEM in early February if you are interested. It's a big open source conference (and this is a DevRoom) so likely you've heard of it, but I wanted to share so we can have good collaboration across our HPC and cloud communities.
Nice 👍 , I'm not likely going to FOSDEM this year, but I'll check the slides online
Wow, time flies - thanks again for your help on this @AkihiroSuda! I thought of it because I'm running this again, just on slightly larger / better infrastructure (network- and scale-wise). To return to our last correspondence, for those interested in the talk, it's the Bare Metal Bros and was really fun to do - we are hoping to extend this to a reproducible setup for others to use (actually I'm mostly done with that as of this week, just tidying it up for our own experiments). AWS uses the Elastic Fabric Adapter (EFA), and getting that working with usernetes took some elbow grease!
I can confirm this strategy to expose the service via the docker-compose, using hostPort, (still) works like a charm, but now on AWS with EFA. I'm good to close, thank you again! And thank you for everything that you do for our communities - it's really admirable and inspiring.
I know that I need to add additional ports to the docker-compose.yaml for them to be exposed (e.g., for a service running in a pod). I did this for several, and tried both with and without https://raw.githubusercontent.com/kubernetes/ingress-nginx/main/deploy/static/provider/kind/deploy.yaml, but I can't seem to access the service. For detail, the service (confirmed working with this setup on my local machine, and inside the pod via a curl to localhost) should be exposed with this service setup:
And the selector for the pod:
But it can't seem to see it:
From inside the pod, this is what we should see (but via the service, without the 8080 port):
I'm thinking about this more, and I think I need to force a recreate of the container (I just restarted) so I'll try that and report back! I didn't do that because I was afraid I'd lose the `join-command`, but that's not a big deal to redo. E.g.,