[X] I have searched in the issues and found no similar issues.
Describe the proposal
To support deployment on K8S natively and smoothly, we may have to add the following support:
Expose more fields in the operator's CRD, such as RuntimeClassName, Tolerations, Annotation and Affinity, so that the shuffle server can be deployed more flexibly.
LogHostPath and HostPathMounts may need to be refactored so that they are supplied by the container runtime. Because shuffle servers may be deployed on mixed nodes, HostPathMounts can differ from host to host.
Add a CLI binary to hide the details of RSS operations: rolling upgrade, restart, full upgrade, gray (canary) release, etc.
vpc template support
service and network refinement:
The shuffle server is a network-traffic-heavy application, so it is not wise to use a Service to proxy external clients' read/write requests to shuffle servers.
The coordinators' deployment may need some refinement. In the current architecture, the coordinator can only run with 1 replica; otherwise there would be a split-brain problem.
Various bug fixes, such as the init containers' resource requests/limits.
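To make the first item more concrete, here is a minimal sketch of what exposing scheduling fields in the CRD and copying them into the shuffle server pod template could look like. All type and function names (`SchedulingSpec`, `PodTemplate`, `ApplyTo`) are hypothetical and are not taken from the operator's code; the structs are simplified stand-ins for the real Kubernetes API types, and Affinity is left out for brevity.

```go
package main

import "fmt"

// Toleration is a simplified stand-in for the Kubernetes toleration type.
type Toleration struct {
	Key      string
	Operator string
	Value    string
	Effect   string
}

// SchedulingSpec is a hypothetical fragment of the CRD spec that exposes
// the pod-level fields mentioned in the proposal.
type SchedulingSpec struct {
	RuntimeClassName *string
	Tolerations      []Toleration
	Annotations      map[string]string
}

// PodTemplate is a simplified stand-in for the pod template the operator builds.
type PodTemplate struct {
	RuntimeClassName *string
	Tolerations      []Toleration
	Annotations      map[string]string
}

// ApplyTo copies the user-supplied fields into the pod template.
// Annotations are merged; on a key conflict the value from the spec wins.
func (s SchedulingSpec) ApplyTo(pod *PodTemplate) {
	if s.RuntimeClassName != nil {
		pod.RuntimeClassName = s.RuntimeClassName
	}
	pod.Tolerations = append(pod.Tolerations, s.Tolerations...)
	if pod.Annotations == nil {
		pod.Annotations = map[string]string{}
	}
	for k, v := range s.Annotations {
		pod.Annotations[k] = v
	}
}

func main() {
	runtimeClass := "kata" // example value only
	spec := SchedulingSpec{
		RuntimeClassName: &runtimeClass,
		Tolerations: []Toleration{
			{Key: "dedicated", Operator: "Equal", Value: "rss", Effect: "NoSchedule"},
		},
		Annotations: map[string]string{"team": "data"},
	}
	pod := PodTemplate{Annotations: map[string]string{"app": "rss"}}
	spec.ApplyTo(&pod)
	fmt.Println(*pod.RuntimeClassName, len(pod.Tolerations), len(pod.Annotations))
}
```

In a real operator these fields would more likely reuse the upstream Kubernetes types directly, so that users can paste the same YAML they already use for other workloads.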
Are you willing to submit PR?