Closed: barby1138 closed this issue 12 months ago
@barby1138 At the time that it was disabled, DPDK tended to break catastrophically when run in Pods. It makes a huge number of presumptions about owning the whole box, and if any of them are not met it simply crashes (and takes VPP with it). We haven't revisited it lately, and maybe it's better now, but that's why we disabled it by default for things like the forwarder and our NSCs that use memif.
It would be quite easy to re-enable it for a specific NSE that is binding to a NIC with DPDK if one could sort out the conflict between DPDK and K8s.
Hi Ed, I think that was a long time ago. I have been running VPP/DPDK 22/23 in production in a k8s environment for the last year and it works like a charm, with 1G hugepages enabled. So it is worth considering bringing it back: there is not much sense in using memif and then falling back to the kernel to pass traffic between nodes / clusters. This concerns forwarders only, in my opinion; there is no need to use it in the NSC or NSE. Regarding VCL: if we work at the socket level rather than the packet level, it's better to use VCL than memif, and there is no need to bring VPP into application pods. IMHO this is definitely worth bringing in as well.
Thank you. Have a nice day!!!
@barby1138 Question: are you hand-setting the ulimits on locked memory for your cluster? I ask because I specifically remember one of the issues being that DPDK insisted on locking a certain amount of memory (even though most K8s clusters have no swap). You could hack around it by fixing the ulimits on all of your nodes... but when someone simply deployed into a vanilla K8s environment without tweaking, it had a very high probability of blowing up.
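For reference, checking and raising that limit on a node looks roughly like this (just a sketch; whether you set it via limits.conf, systemd, or your container runtime's config is environment-specific):

# check the current max locked memory (kbytes) on the node
ulimit -l
# one common way to raise it persistently, e.g. in /etc/security/limits.conf:
#   *  soft  memlock  unlimited
#   *  hard  memlock  unlimited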
@barby1138 A good quick test would be: does DPDK work out of the box in a Kind env? That's about as vanilla as things get.
Hi, Yes - sure. The only prerequisite is to have 1G hugepages configured at setup. And of course every setup has its own specific DPDK device PCI address. I can share the patch-forwarder-vpp.yaml I use for it, if you're interested.
A good thing about the VPP forwarder is that if I put my startup.conf file into the mounted /etc/vpp/helper, it uses that instead of the default one. So generally, DPDK can be enabled if needed in the existing solution. I just wondered why it's disabled by default.
What about vcl?
@barby1138 I'd love to see your patch-forwarder-vpp.yaml :)
And glad you found the 'stomping' feature for startup.conf useful :) We built it that way intentionally because we were certain folks would encounter times and places they needed to customize.
WRT vcl... I'm curious what you are thinking. I'm generally a vcl fan.
Hi Ed,
patch-forwarder-vpp.yaml is the following:
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: forwarder-vpp
spec:
  template:
    spec:
      containers:
        - name: forwarder-vpp
          resources:
            requests:
              hugepages-1Gi: 2Gi
              memory: 2Gi
            limits:
              hugepages-1Gi: 2Gi
              memory: 2Gi
          volumeMounts:
            - name: etcvpp
              mountPath: /etc/vpp
            - name: hugepage
              mountPath: /dev/hugepages/
            - name: vpp
              mountPath: /var/run/vpp
      volumes:
        - name: etcvpp
          hostPath:
            path: /etc/vpp
            type: Directory
        - name: hugepage
          emptyDir:
            medium: HugePages
        - name: vpp
          hostPath:
            path: /var/run/vpp
            type: DirectoryOrCreate
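To apply it, one option is a kustomization alongside it, roughly like this (a sketch; the resources entry is a placeholder for whatever base provides the stock forwarder-vpp DaemonSet):

# kustomization.yaml (sketch), then: kubectl apply -k .
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - <base-providing-forwarder-vpp>
patchesStrategicMerge:
  - patch-forwarder-vpp.yaml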
Just make sure 1G hugepages are configured at setup; I do it via grub.
Then add an /etc/vpp/helper/vpp.conf with the DPDK plugin / device enabled.
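As a concrete sketch of those two steps (the page count and the PCI address below are placeholders - use values that match your setup):

# /etc/default/grub - reserve 1G hugepages at boot, then regenerate the grub config and reboot
GRUB_CMDLINE_LINUX="... default_hugepagesz=1G hugepagesz=1G hugepages=4"

# relevant additions to the startup.conf you drop in as /etc/vpp/helper/vpp.conf
plugins {
  plugin dpdk_plugin.so { enable }
}
dpdk {
  # PCI address of your NIC / VF (placeholder)
  dev 0000:03:00.0
}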
Regarding VCL, I'll write you back later - need to gather my thoughts :)
Have a nice day!!!
Just make sure 1G hugepages are configured at setup; I do it via grub.
Yeah... we don't always have that much control over the environments we deploy into. While I completely concur you want things like Hugepages (and core pinning etc) for optimal performance... our out of the box configs are optimized for two things:
Your config above is a great example of principle (2) :)
Hi Ed,
I took some holidays for a couple of days.
So, VCL:
I will try to describe how I see the VCL feature being represented in NSM. I will describe the simplest setup, but with an eye toward scalability.
Below is some reference bash code showing how I configure VCL manually for now:
# Forwarder pods in each cluster
VPP_POD1=$(kubectl --kubeconfig=$KUBECONFIG1 get pod -l app=forwarder-vpp -n nsm-system -o jsonpath="{.items[0].metadata.name}")
echo $VPP_POD1
VPP_POD2=$(kubectl --kubeconfig=$KUBECONFIG2 get pod -l app=forwarder-vpp -n nsm-system -o jsonpath="{.items[0].metadata.name}")
echo $VPP_POD2

# Forwarder interfaces and the VCL endpoint addresses to assign to them
PCIDEV1=vxlan_tunnel1
VCL_IP_ADDR1=172.17.1.8/16
PCIDEV2=vxlan_tunnel1
VCL_IP_ADDR2=172.17.1.9/16

# Configure the addresses on the forwarder interfaces and verify
kubectl --kubeconfig=$KUBECONFIG1 exec -it $VPP_POD1 -n nsm-system -- vppctl set int ip addr $PCIDEV1 $VCL_IP_ADDR1
kubectl --kubeconfig=$KUBECONFIG1 exec -it $VPP_POD1 -n nsm-system -- vppctl sh int addr
kubectl --kubeconfig=$KUBECONFIG2 exec -it $VPP_POD2 -n nsm-system -- vppctl set int ip addr $PCIDEV2 $VCL_IP_ADDR2
kubectl --kubeconfig=$KUBECONFIG2 exec -it $VPP_POD2 -n nsm-system -- vppctl sh int addr

# Write a per-pod file with its local forwarder's VCL address and place it in /etc/vpp
# (locally for cluster 1, via scp for cluster 2)
CLUSTER2_IP=<my.cluster2.IP>
SERVICE_NAME="vcl_data"
APP_POD1=$(kubectl --kubeconfig=$KUBECONFIG1 get pod -l app=app-1 -n ns-floating-kernel2ethernet2kernel -o jsonpath="{.items[0].metadata.name}")
CONF_NAME1=$APP_POD1$SERVICE_NAME
echo 172.17.1.8/16 > $CONF_NAME1
cp $CONF_NAME1 /etc/vpp
rm -f $CONF_NAME1
NSE_POD1=$(kubectl --kubeconfig=$KUBECONFIG2 get pod -l app=nse-kernel-1 -n ns-floating-kernel2ethernet2kernel -o jsonpath="{.items[0].metadata.name}")
CONF_NAME2=$NSE_POD1$SERVICE_NAME
echo 172.17.1.9/16 > $CONF_NAME2
scp $CONF_NAME2 root@$CLUSTER2_IP:/etc/vpp
rm -f $CONF_NAME2
Summary: the main idea is that a VCL connection is built just like a kernel one, but there are no taps. Instead, the forwarder-VPP interfaces are configured and this info is injected into the client / NSE.
Maybe you have better ideas on how to enable VCL with even less effort.
Have a nice day!!!
Hi guys,
Do we have any progress here? Should I open this as a feature request?
Have a nice weekend!!!
@barby1138 Bear with me while I try to swap my VCL knowledge back into my brain :) If memory serves, VCL is 'set up' by a user by sending messages over a unix file socket, correct?
In which case it would work very much like memif. I like your idea of a vcl mechanism type. So maybe something like:
vcl://${service-name}/${optional requested filename of unix file socket}
Thoughts?
Hi Ed, glad to hear from you :) No, ${optional requested filename of unix file socket} is not needed. The VCL client needs a VCL configuration with socket, queues, secrets, etc., but that is client logic, not related to NSM, per my vision. Also, the control socket is shared with the VCL client via a folder mounted from the forwarder. So the only things needed from the VPP forwarder are to share the control socket and to enable sessions in startup.conf.
It's described well here: https://www.envoyproxy.io/docs/envoy/latest/configuration/other_features/vcl - refer to "Installing and running VPP/VCL".
Yes, technically it's like memif, but with VCL we work with sockets, not packets, and there is no need to bring VPP to the clients, just some libs - but that's the client's responsibility. For the test NSC / NSE we'll need it - I can help with that. I already have a working, manually configured setup.
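To make that concrete, roughly what the two pieces look like, as a sketch (the socket path follows the Envoy VCL doc above and depends on how /var/run/vpp is shared from the forwarder):

# forwarder startup.conf - enable the session layer and its app socket API
session {
  enable
  use-app-socket-api
}

# client-side vcl.conf along these lines (values are examples, not NSM defaults)
vcl {
  rx-fifo-size 4000000
  tx-fifo-size 4000000
  app-scope-local
  app-scope-global
  app-socket-api /var/run/vpp/app_ns_sockets/default
}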
@barby1138 if you have something working feel free to open PR into deployments repo with your example ;)
Another option is that you could also put your configurations here, and we will add examples on our side.
Hi Denis
I will reply in the newly opened Feature: VCL #10023
thanks
Hi
I have several questions:
Why is dpdk_plugin in the VPP forwarder disabled by default? It's easy to bring it up, but I'm not sure it is well supported by NSM - is it part of the concept?
I have 2+ clusters with the same type of services chained. Each cluster has 1 node with 2 DPDK VF interfaces (in/out). Also, the graph runs in both directions (uplink/downlink), so every client is also a service. I want to apply different connection config profiles.
Ex.
I use nse-remote-vlan and the VPP forwarder to select between DPDK VFs, so I have a chain like this:
            |----> nse-remote-vlan-cluster1 .... nse-remote-vlan-cluster2 <---|
            |                                                                 |
cli1-type1 <-> ... VF1 ... <-> ... VF1 ... <-> ....................... cli1-type2
cli2-type1 ............................................................ cli2-type2
I want to connect clients in different manners dynamically, depending on load etc., using routing rules in the forwarders. Is this topology correct? If yes, how do the nse-remote-vlans provide routes between clusters? Should something additional be configured in forwarder-vpp? If no, what would you suggest?