Open woehrl01 opened 7 months ago
Apologies, the right way to set this is:
[settings.kernel.sysctl]
"net.ipv6.conf.all.optimistic_dad" = "1"
"net.ipv6.conf.default.optimistic_dad" = "1"
Thank you for your update. I would love to know if (as I hope) you see measurably faster startup with optimistic duplicate address detection.
@larvacea Unfortunately, setting optimistic_dad = 1 or accept_dad = 0, regardless of the interface, has no impact on startup latency. There is still a 2-3 second delay on IPv6 pod startup (compared to IPv4). I can confirm that the value is picked up by the vethd* interfaces created for the sandboxes.
@larvacea @woehrl01 a couple of other ideas:
Optimistic DAD might need to be combined with "use_optimistic" in order to actually make use of the tentative addresses. Also, given the evidence that DAD is being performed despite accept_dad = 0, we could try setting dad_transmits = 0 to override it:
[settings.kernel.sysctl]
# don't enable DAD
"net.ipv6.conf.all.accept_dad" = "0"
"net.ipv6.conf.default.accept_dad" = "0"
# don't transmit any DAD probes
"net.ipv6.conf.all.dad_transmits" = "0"
"net.ipv6.conf.default.dad_transmits" = "0"
# if we end up using DAD, go ahead and use the tentative addresses
"net.ipv6.conf.all.optimistic_dad" = "1"
"net.ipv6.conf.all.use_optimistic" = "1"
Thank you @bcressey. I just tried your configuration as well as additional permutations; the startup delay of around 2 seconds still persists:
bash-5.1# tail -n +1 /proc/sys/net/ipv6/conf/*/*dad*
==> /proc/sys/net/ipv6/conf/all/accept_dad <==
0
==> /proc/sys/net/ipv6/conf/all/dad_transmits <==
0
==> /proc/sys/net/ipv6/conf/all/enhanced_dad <==
1
==> /proc/sys/net/ipv6/conf/all/optimistic_dad <==
1
==> /proc/sys/net/ipv6/conf/default/accept_dad <==
0
==> /proc/sys/net/ipv6/conf/default/dad_transmits <==
0
==> /proc/sys/net/ipv6/conf/default/enhanced_dad <==
1
==> /proc/sys/net/ipv6/conf/default/optimistic_dad <==
1
==> /proc/sys/net/ipv6/conf/eni0dfdceb3448/accept_dad <==
0
==> /proc/sys/net/ipv6/conf/eni0dfdceb3448/dad_transmits <==
0
==> /proc/sys/net/ipv6/conf/eni0dfdceb3448/enhanced_dad <==
1
==> /proc/sys/net/ipv6/conf/eni0dfdceb3448/optimistic_dad <==
1
==> /proc/sys/net/ipv6/conf/eni19314c3cd96/accept_dad <==
0
==> /proc/sys/net/ipv6/conf/eni19314c3cd96/dad_transmits <==
0
==> /proc/sys/net/ipv6/conf/eni19314c3cd96/enhanced_dad <==
1
==> /proc/sys/net/ipv6/conf/eni19314c3cd96/optimistic_dad <==
1
==> /proc/sys/net/ipv6/conf/eni79b4cbaf095/accept_dad <==
0
==> /proc/sys/net/ipv6/conf/eni79b4cbaf095/dad_transmits <==
0
==> /proc/sys/net/ipv6/conf/eni79b4cbaf095/enhanced_dad <==
1
==> /proc/sys/net/ipv6/conf/eni79b4cbaf095/optimistic_dad <==
1
==> /proc/sys/net/ipv6/conf/eni8d1aa624f0c/accept_dad <==
0
==> /proc/sys/net/ipv6/conf/eni8d1aa624f0c/dad_transmits <==
0
==> /proc/sys/net/ipv6/conf/eni8d1aa624f0c/enhanced_dad <==
1
==> /proc/sys/net/ipv6/conf/eni8d1aa624f0c/optimistic_dad <==
1
==> /proc/sys/net/ipv6/conf/eni8f2e97e2322/accept_dad <==
0
==> /proc/sys/net/ipv6/conf/eni8f2e97e2322/dad_transmits <==
0
==> /proc/sys/net/ipv6/conf/eni8f2e97e2322/enhanced_dad <==
1
==> /proc/sys/net/ipv6/conf/eni8f2e97e2322/optimistic_dad <==
1
==> /proc/sys/net/ipv6/conf/enid559aefed0e/accept_dad <==
0
==> /proc/sys/net/ipv6/conf/enid559aefed0e/dad_transmits <==
0
==> /proc/sys/net/ipv6/conf/enid559aefed0e/enhanced_dad <==
1
==> /proc/sys/net/ipv6/conf/enid559aefed0e/optimistic_dad <==
1
==> /proc/sys/net/ipv6/conf/enie114b69e62e/accept_dad <==
0
==> /proc/sys/net/ipv6/conf/enie114b69e62e/dad_transmits <==
0
==> /proc/sys/net/ipv6/conf/enie114b69e62e/enhanced_dad <==
1
==> /proc/sys/net/ipv6/conf/enie114b69e62e/optimistic_dad <==
1
==> /proc/sys/net/ipv6/conf/eth0/accept_dad <==
0
==> /proc/sys/net/ipv6/conf/eth0/dad_transmits <==
0
==> /proc/sys/net/ipv6/conf/eth0/enhanced_dad <==
1
==> /proc/sys/net/ipv6/conf/eth0/optimistic_dad <==
0
==> /proc/sys/net/ipv6/conf/lo/accept_dad <==
-1
==> /proc/sys/net/ipv6/conf/lo/dad_transmits <==
1
==> /proc/sys/net/ipv6/conf/lo/enhanced_dad <==
1
==> /proc/sys/net/ipv6/conf/lo/optimistic_dad <==
0
==> /proc/sys/net/ipv6/conf/veth1003d20a/accept_dad <==
0
==> /proc/sys/net/ipv6/conf/veth1003d20a/dad_transmits <==
0
==> /proc/sys/net/ipv6/conf/veth1003d20a/enhanced_dad <==
1
==> /proc/sys/net/ipv6/conf/veth1003d20a/optimistic_dad <==
1
==> /proc/sys/net/ipv6/conf/veth1de24e44/accept_dad <==
0
==> /proc/sys/net/ipv6/conf/veth1de24e44/dad_transmits <==
0
==> /proc/sys/net/ipv6/conf/veth1de24e44/enhanced_dad <==
1
==> /proc/sys/net/ipv6/conf/veth1de24e44/optimistic_dad <==
1
==> /proc/sys/net/ipv6/conf/veth2df40afd/accept_dad <==
0
==> /proc/sys/net/ipv6/conf/veth2df40afd/dad_transmits <==
0
==> /proc/sys/net/ipv6/conf/veth2df40afd/enhanced_dad <==
1
==> /proc/sys/net/ipv6/conf/veth2df40afd/optimistic_dad <==
1
==> /proc/sys/net/ipv6/conf/veth4024dff7/accept_dad <==
0
==> /proc/sys/net/ipv6/conf/veth4024dff7/dad_transmits <==
0
==> /proc/sys/net/ipv6/conf/veth4024dff7/enhanced_dad <==
1
==> /proc/sys/net/ipv6/conf/veth4024dff7/optimistic_dad <==
1
==> /proc/sys/net/ipv6/conf/veth524efe7e/accept_dad <==
0
==> /proc/sys/net/ipv6/conf/veth524efe7e/dad_transmits <==
0
==> /proc/sys/net/ipv6/conf/veth524efe7e/enhanced_dad <==
1
==> /proc/sys/net/ipv6/conf/veth524efe7e/optimistic_dad <==
1
==> /proc/sys/net/ipv6/conf/vetha8d8bf98/accept_dad <==
0
==> /proc/sys/net/ipv6/conf/vetha8d8bf98/dad_transmits <==
0
==> /proc/sys/net/ipv6/conf/vetha8d8bf98/enhanced_dad <==
1
==> /proc/sys/net/ipv6/conf/vetha8d8bf98/optimistic_dad <==
1
==> /proc/sys/net/ipv6/conf/vethce2339e3/accept_dad <==
0
==> /proc/sys/net/ipv6/conf/vethce2339e3/dad_transmits <==
0
==> /proc/sys/net/ipv6/conf/vethce2339e3/enhanced_dad <==
1
==> /proc/sys/net/ipv6/conf/vethce2339e3/optimistic_dad <==
1
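A listing this long is easy to misread, so the same check can be scripted. The following is a minimal sketch, not part of the original thread: the function name `check_dad_sysctl` and its optional third argument (an alternative directory, useful for testing) are hypothetical. It prints only the interfaces whose value differs from the expected one.

```shell
#!/bin/sh
# Print every interface whose DAD-related sysctl differs from the expected value.
# Usage: check_dad_sysctl KEY EXPECTED [CONF_DIR]
# CONF_DIR defaults to the live sysctl tree (an assumption that /proc is mounted).
check_dad_sysctl() {
    key="$1"        # e.g. optimistic_dad
    expected="$2"   # e.g. 1
    conf_dir="${3:-/proc/sys/net/ipv6/conf}"
    for f in "$conf_dir"/*/"$key"; do
        # Skip the unexpanded glob pattern and unreadable entries.
        [ -r "$f" ] || continue
        value=$(cat "$f")
        if [ "$value" != "$expected" ]; then
            iface=$(basename "$(dirname "$f")")
            echo "$iface $key=$value (expected $expected)"
        fi
    done
}
```

Against the listing above, `check_dad_sysctl optimistic_dad 1` would report only eth0 and lo, the two interfaces where optimistic_dad is still 0.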
While digging through code around the web, I came across the following implementation in Android: https://android.googlesource.com/platform/frameworks/base/+/befe778%5E%21/#F0
It looks like, if optimistic_dad is enabled, IFA_F_TENTATIVE is set together with IFA_F_OPTIMISTIC, so the following check in the AWS VPC CNI still fails until DAD has succeeded: https://github.com/aws/amazon-vpc-cni-k8s/pull/1631/files#diff-afc7977e1f00abb3f66455a7d491ded671d38ffa43e0dc910606084ec4fd4841R250-R255
I'm still not sure why IFA_F_TENTATIVE is set when DAD is disabled, but I found the following (fixed) Red Hat issue where the address was set to tentative even with dad_transmits=0: https://bugzilla.redhat.com/show_bug.cgi?id=709271
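To see which of these flags an address actually carries, the flags column of /proc/net/if_inet6 can be decoded directly. This is a rough sketch rather than the check the CNI performs: the bit values are taken from <linux/if_addr.h>, and the function name `list_tentative` is hypothetical. It reads if_inet6-style lines on stdin and prints every address that is still tentative, noting whether it is also optimistic.

```shell
#!/bin/sh
# Address flag bits from <linux/if_addr.h>.
IFA_F_OPTIMISTIC=4    # 0x04
IFA_F_TENTATIVE=64    # 0x40

# Columns of /proc/net/if_inet6:
#   address ifindex prefixlen scope flags devname
# All numeric columns are hexadecimal without a 0x prefix.
list_tentative() {
    while read -r addr ifindex plen scope flags dev; do
        # Skip blank or malformed lines.
        [ -n "$flags" ] || continue
        f=$((0x$flags))
        if [ $((f & IFA_F_TENTATIVE)) -ne 0 ]; then
            if [ $((f & IFA_F_OPTIMISTIC)) -ne 0 ]; then
                echo "$dev $addr tentative+optimistic"
            else
                echo "$dev $addr tentative"
            fi
        fi
    done
}
```

On a node this would be run as `list_tentative < /proc/net/if_inet6`. Note that it only sees addresses in the network namespace it runs in, so addresses inside a pod's namespace need to be inspected from within that namespace.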
edit:
I have some additional findings. Running the following script on a node with the above settings clearly shows that no interfaces are created in the tentative state. With the default settings, interfaces do show up in that state.
for i in {1..1000}; do ip -6 addr show | grep "tentative"; sleep 0.1; done
@woehrl01 based on your last update, this seems to be expected behavior, right? You wrote:
"I have some additional findings. Running the following script on a node with the above settings clearly shows that no interfaces are created in the tentative state. With the default settings, interfaces do show up in that state."
Do you mind clarifying the open request if one still exists?
@KCSesh there is a behaviour I don't understand: the CNI plugin clearly waits for 2 seconds in a tentative state even though DAD is disabled.
So the question is: are there additional settings that need to be applied to fully disable DAD, so that all interfaces are stable immediately?
@woehrl01 I created an IPv6 Bottlerocket cluster with the following configuration:
[settings.kernel.sysctl]
# don't enable DAD
"net.ipv6.conf.all.accept_dad" = "0"
"net.ipv6.conf.default.accept_dad" = "0"
# use initial net namespace IPv6 settings for new namespaces
"net.core.devconf_inherit_init_net" = "1"
This disables DAD and reduces the time between the Scheduled and Pulled steps of pod creation from 2-3 seconds to 0-1 seconds.
@vyaghras Thank you, I can confirm that adding net.core.devconf_inherit_init_net works to successfully disable DAD.
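A plausible reading of why this setting makes the difference, based on the kernel's documented modes for net.core.devconf_inherit_init_net rather than on anything stated in the thread: by default a new network namespace resets its IPv6 conf/{all,default} values to the kernel defaults instead of copying them from the host, so a pod's namespace would start with accept_dad = 1 even when the host has it at 0. The sketch below only models that documented behaviour; the function name and argument order are hypothetical, and it does not query the kernel.

```shell
#!/bin/sh
# Model which value a NEW network namespace starts with for an IPv6
# conf/{all,default} setting, per the documented devconf_inherit_init_net modes.
# Usage: new_netns_ipv6_value MODE INIT_NET_VALUE KERNEL_DEFAULT CREATOR_NETNS_VALUE
new_netns_ipv6_value() {
    mode="$1"
    init_net="$2"        # current value in the initial (host) namespace
    kernel_default="$3"  # built-in default, e.g. 1 for accept_dad
    creator="$4"         # current value in the namespace creating the new one
    case "$mode" in
        0) echo "$kernel_default" ;;  # default mode: IPv6 resets to defaults
        1) echo "$init_net" ;;        # inherit from the initial namespace
        2) echo "$kernel_default" ;;  # always reset to defaults
        3) echo "$creator" ;;         # inherit from the creating namespace
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
}
```

With the host at accept_dad = 0 and a kernel default of 1, `new_netns_ipv6_value 0 0 1 0` prints 1 and `new_netns_ipv6_value 1 0 1 0` prints 0. That would be consistent with the host-side listing showing DAD disabled while pods still waited on a tentative address.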
What I'd like:
We want to change the sysctl value net.ipv6.conf.all.optimistic_dad=1
We created a bootstrap container executing the following script:
but it fails with:
How can we change that?
Any alternatives you've considered:
None that I'm aware of. Running it from the admin container via sheltie changes the value successfully.
Related to: https://github.com/aws/amazon-vpc-cni-k8s/pull/1631