Closed qchenzi closed 4 days ago
Looks good to me. @qchenzi Would you prefer to add it by yourself?
Sure, I can try to add it myself. But I'm not entirely clear on the related logic and where to start. Could you please provide some guidance or suggestions, particularly on which part of the Milvus Operator configuration I should focus on and any specific components or files that are critical for implementing the host network mode? Your help would be greatly appreciated. Thank you!
If I understand the feature correctly, we should add fields to the Milvus CR to adjust `spec.hostNetwork` and `spec.dnsPolicy` in the pod template. The final pod manifest should look like this:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: busybox
spec:
  hostNetwork: true
  dnsPolicy: ClusterFirstWithHostNet
```
ref: https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/
First, we need to add `hostNetwork` and `dnsPolicy` fields to the `ComponentSpec` struct located in https://github.com/zilliztech/milvus-operator/blob/592de3d6e6f7b7a43fbd11c5f2e99d211b036767/apis/milvus.io/v1beta1/components_types.go#L33, then run the `generate-all` target to update the CRD manifests and the structs' deep-copy functions.
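To make the struct change concrete, here is a minimal sketch of what the two new fields might look like. This is an assumption, not the actual diff: the real `ComponentSpec` embeds many other fields and uses `corev1.DNSPolicy` from `k8s.io/api/core/v1`, which is replaced here by a local stand-in type so the sketch compiles on its own; the exact field names and json tags should be confirmed against the eventual PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// DNSPolicy is a stand-in for k8s.io/api/core/v1.DNSPolicy so this
// sketch compiles without the Kubernetes dependencies.
type DNSPolicy string

// ComponentSpec sketch: only the two proposed fields are shown; the
// real struct in components_types.go carries many more fields.
type ComponentSpec struct {
	// HostNetwork, when true, runs the component's pods in the
	// host's network namespace.
	HostNetwork bool `json:"hostNetwork,omitempty"`
	// DNSPolicy should normally be ClusterFirstWithHostNet when
	// HostNetwork is true, so in-cluster DNS still resolves.
	DNSPolicy DNSPolicy `json:"dnsPolicy,omitempty"`
}

// specJSON renders an example spec, showing how the new fields
// would appear in the serialized CR.
func specJSON() string {
	spec := ComponentSpec{HostNetwork: true, DNSPolicy: "ClusterFirstWithHostNet"}
	out, _ := json.Marshal(spec)
	return string(out)
}

func main() {
	fmt.Println(specJSON()) // {"hostNetwork":true,"dnsPolicy":"ClusterFirstWithHostNet"}
}
```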
Doing this adds configuration fields that can be used as in the two cases below:
```yaml
# Case 1: configure all components to use the host network
apiVersion: milvus.io/v1beta1
kind: Milvus
metadata:
  name: my-release
  labels:
    app: milvus
spec:
  components:
    hostNetwork: true
    dnsPolicy: ClusterFirstWithHostNet
```
```yaml
# Case 2: configure some of the components to use the host network
apiVersion: milvus.io/v1beta1
kind: Milvus
metadata:
  name: my-release
  labels:
    app: milvus
spec:
  components:
    proxy:
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
    mixcoord:
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
```
Then we need to add render logic that applies these fields to `deploy.spec.podTemplate` in https://github.com/zilliztech/milvus-operator/blob/592de3d6e6f7b7a43fbd11c5f2e99d211b036767/pkg/controllers/deployment_updater.go#L80
Finally, we need to add unit tests in https://github.com/zilliztech/milvus-operator/blob/main/pkg/controllers/deployment_updater_test.go to verify that the render logic works.
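As a rough illustration of what such a test could cover, the runnable sketch below checks the render logic's two interesting cases. Note the stand-in types: a real test would use `corev1.PodTemplateSpec` and the package's `deploymentUpdater` interface (likely via mocks, as the existing tests do), so treat this as a behavioral outline rather than the actual test code.

```go
package main

import "fmt"

// Stand-in types so the sketch runs without Kubernetes imports; a real
// test would use corev1.PodTemplateSpec and a mocked deploymentUpdater.
type PodSpec struct {
	HostNetwork bool
	DNSPolicy   string
}
type PodTemplateSpec struct{ Spec PodSpec }

type MergedComponentSpec struct {
	HostNetwork bool
	DNSPolicy   string
}

// updateNetworkSettings mirrors the render logic under test: HostNetwork
// is copied unconditionally, DNSPolicy only when explicitly set.
func updateNetworkSettings(template *PodTemplateSpec, spec MergedComponentSpec) {
	template.Spec.HostNetwork = spec.HostNetwork
	if len(spec.DNSPolicy) > 0 {
		template.Spec.DNSPolicy = spec.DNSPolicy
	}
}

func main() {
	// Case 1: hostNetwork enabled with an explicit DNS policy.
	tpl := PodTemplateSpec{}
	updateNetworkSettings(&tpl, MergedComponentSpec{HostNetwork: true, DNSPolicy: "ClusterFirstWithHostNet"})
	fmt.Println(tpl.Spec.HostNetwork, tpl.Spec.DNSPolicy) // true ClusterFirstWithHostNet

	// Case 2: an empty DNSPolicy must not clobber an existing value.
	tpl2 := PodTemplateSpec{Spec: PodSpec{DNSPolicy: "ClusterFirst"}}
	updateNetworkSettings(&tpl2, MergedComponentSpec{})
	fmt.Println(tpl2.Spec.HostNetwork, tpl2.Spec.DNSPolicy) // false ClusterFirst
}
```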
Hi @haorenfsa
I have submitted a pull request that adds host network support for components: https://github.com/zilliztech/milvus-operator/pull/141
After implementing the changes, the result can be seen in the attached image:
And we can add configuration fields like this:
```yaml
apiVersion: milvus.io/v1beta1
kind: Milvus
metadata:
  name: my-release
  labels:
    app: milvus
spec:
  components:
    image: "milvusdb/milvus:v2.4.4"
    proxy:
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
```
Please review the pull request at your earliest convenience.
Hi @haorenfsa
I found some new issues after implementing https://github.com/zilliztech/milvus-operator/pull/141, and they have me confused.
Although the replicas for all components are set to 1, I observe that during the initialization phase the number of pods increases to 2 for each component, as in the attached image below:
This eventually resolves itself and the pod count returns to the expected 1, but I also noticed that there are still two deployments for querynode, as in the image below:
Could you please provide any insights into why these issues might be occurring and how we can address them?
However, after modifying the updateNetworkSettings function as shown below, these issues appear to be resolved:
Updated function:
```go
func updateNetworkSettings(template *corev1.PodTemplateSpec, updater deploymentUpdater) {
	mergedComSpec := updater.GetMergedComponentSpec()
	template.Spec.HostNetwork = mergedComSpec.HostNetwork
	if len(mergedComSpec.DNSPolicy) > 0 {
		logf.Log.Info("update dns policy", "dnsPolicy", mergedComSpec.DNSPolicy, "component", updater.GetComponentName())
		template.Spec.DNSPolicy = mergedComSpec.DNSPolicy
	}
}
```
Original function:

```go
func updateNetworkSettings(template *corev1.PodTemplateSpec, updater deploymentUpdater) {
	mergedComSpec := updater.GetMergedComponentSpec()
	template.Spec.HostNetwork = mergedComSpec.HostNetwork
	template.Spec.DNSPolicy = mergedComSpec.DNSPolicy
}
```
It seems that the conditional check for DNSPolicy has addressed the issue, though I'm not entirely sure why. My guess is that unconditionally writing an empty DNSPolicy differs from the value the API server defaults it to (ClusterFirst), so every reconcile saw a spec difference and triggered another rolling update.
One more thing to note when enabling this feature: every Milvus pod uses port 9091, so it's advised to add anti-affinity for Milvus pods that have hostNetwork enabled. Otherwise, restarts or scale-outs may fail because the pods can be scheduled to the same worker node and collide on the port.
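To make that anti-affinity advice concrete, a possible configuration might look like the sketch below. This assumes the operator's `ComponentSpec` exposes a standard `affinity` field and that the operator labels pods with `app.kubernetes.io/instance`; both should be verified against the actual CRD and pod labels before relying on it.

```yaml
apiVersion: milvus.io/v1beta1
kind: Milvus
metadata:
  name: my-release
spec:
  components:
    proxy:
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      # Spread hostNetwork pods across nodes so they don't collide on port 9091.
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app.kubernetes.io/instance: my-release   # assumed pod label
              topologyKey: kubernetes.io/hostname
```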
Description
Milvus is often used for real-time data processing and large-scale vector similarity search, which require high throughput and low latency. Supporting host network mode at the Pod level can reduce network latency by eliminating container networking overhead, crucial for performance-sensitive applications.
Proposed Solution
Introduce an option in the Milvus Operator configuration to enable host network mode for specific components, such as the milvus-proxy, allowing users to opt-in based on their performance needs.
Benefits
- Reduced Latency: Direct access to the host's network stack can significantly lower network latency.
- Improved Throughput: Enhanced network performance can increase query handling capacity.
- Flexibility: Users can optimize deployments based on specific requirements.
Supporting host network mode for specific components would greatly benefit real-time data processing applications in Milvus.
Thank you for considering this request.