Closed djenriquez closed 5 years ago
Ah, Dani helped me find that the solution is here: https://www.nomadproject.io/guides/integrations/consul-connect/index.html#cni-plugins
Be great to have this in the network stanza doc as well!
Thanks Dani! (Great talk on host volumes as well 😁)
It would be great if there were a better explanation here. It seems that this is a common issue with an uncommon answer. Plus, the link above has been retired.
I have a simple job->group->task with the exec driver and I am simply trying to see how things work. Well, all I want is network isolation with bridge mode, and the allocation fails with:
failed to setup alloc: pre-run hook "network" failed: failed to configure networking for alloc: failed to configure network: failed to find plugin "bridge" in path [/opt/cni/bin]
--
But I do not use, nor want to use, CNI. So, what is the deal? I have read the docs multiple times to see what I am missing, and they do not say anywhere that CNI plugins are needed to use the native bridge mode.
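The error message names both the plugin (`bridge`) and the directory searched (`/opt/cni/bin`), so a client node can be checked before a job is scheduled on it. A minimal sketch, not part of Nomad: `check_cni_plugins` is a hypothetical helper, and which plugin names to pass beyond `bridge` is an assumption you should confirm against your own setup.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a Nomad command): report which CNI plugin
# binaries are present and executable in a given directory.
# Usage: check_cni_plugins <cni_dir> <plugin> [<plugin> ...]
check_cni_plugins() {
  local dir="$1"
  shift
  local missing=0
  local plugin
  for plugin in "$@"; do
    if [ -x "$dir/$plugin" ]; then
      echo "ok: $plugin"
    else
      echo "missing: $plugin"
      missing=1
    fi
  done
  # Non-zero exit status if at least one plugin was not found.
  return "$missing"
}
```

For the failure quoted above, `check_cni_plugins /opt/cni/bin bridge` would print `missing: bridge` and return a non-zero status on an affected node.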
Please take this with a grain of salt, but I was able to get this issue fixed by running the following bash on any node that was running Consul Connect. (and adding them to my client setup script)
echo "=== Getting CNI Plugins for Consul Connect ==="
curl -L -o cni-plugins.tgz https://github.com/containernetworking/plugins/releases/download/v0.8.6/cni-plugins-linux-amd64-v0.8.6.tgz
sudo mkdir -p /opt/cni/bin
sudo tar -C /opt/cni/bin -xzf cni-plugins.tgz
echo "=== Allowing container traffic thru bridge network to be routed via iptables ==="
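# Use "sudo tee" so the write also works when the script is not run as root
# (a plain "> file" redirect is performed by the unprivileged shell).
echo 1 | sudo tee /proc/sys/net/bridge/bridge-nf-call-arptables
echo 1 | sudo tee /proc/sys/net/bridge/bridge-nf-call-ip6tables
echo 1 | sudo tee /proc/sys/net/bridge/bridge-nf-call-iptables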
YMMV :)
@mikenomitch those sysctl settings are documented here https://www.nomadproject.io/docs/integrations/consul-connect#cni-plugins but maybe it could be surfaced better?
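Values written under `/proc/sys` take effect immediately but are not kept across a reboot. One way to keep them is a sysctl drop-in file. A minimal sketch, assuming a distribution that reads `/etc/sysctl.d/`; `write_bridge_sysctl_conf` is a hypothetical helper and the file name in the usage note is an arbitrary choice.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write the three bridge netfilter settings to a
# sysctl configuration file so they are reapplied at boot.
# Usage: write_bridge_sysctl_conf <output_file>
write_bridge_sysctl_conf() {
  local out="$1"
  cat > "$out" <<'EOF'
net.bridge.bridge-nf-call-arptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
}
```

Run as root, this might look like `write_bridge_sysctl_conf /etc/sysctl.d/99-bridge-nf.conf` followed by `sysctl --system` to load the file without rebooting.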
We're trying out the network namespace in hopes of moving away from network_mode: container, which currently has issues with task dependencies (sometimes the service comes up before the sidecar, and the task fails because Docker can't attach the container to a non-existent container's network). We need to be able to route to localhost in order to reach our sidecar proxy from our app service. Since network namespaces allow traffic over loopback, this sounded like the solution. It appears our machines are missing a binary of some sort? Should this dependency be added to the changelog? Maybe I'm doing something wrong?
Also, since I'm here: given the problem we're trying to solve above, will the network namespace be the solution we are hoping for?
Nomad version
0.10.0-beta1 (Server + Clients)
Operating system and Environment details
Amazon Linux 2
Issue
Running a job definition with a task group network namespace fails with:
Reproduction steps
Run the job below on a 0.10.0-beta1 Nomad server running Amazon Linux 2.
Job file (if appropriate)