alibaba / GraphScope

🔨 🍇 💻 🚀 GraphScope: A One-Stop Large-Scale Graph Computing System from Alibaba
https://graphscope.io
Apache License 2.0

extended k8s configuration capabilities for nodes and the coordinator #1839

Closed: xarion closed this issue 2 years ago

xarion commented 2 years ago

Is your feature request related to a problem? Please describe.
Currently, pod configurations are created by a script, so we do not have control over them. This makes it difficult to customize the GraphScope deployment. For example, adding a customized mount point, setting ephemeral storage, setting CPU limits to guide the scheduler, or assigning pods to nodes using labels is not possible.

Describe the solution you'd like
Being able to configure the pods however we like.

Describe alternatives you've considered
Currently we are setting k8s defaults for some of the critical issues (ephemeral storage), but this is not a valid solution for all of the problems.
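
One way such cluster-side defaults can be set is with a LimitRange; the sketch below uses the official Kubernetes Python client and assumes, hypothetically, that this is the mechanism in use. The namespace name and storage sizes are placeholders.

```python
# Sketch: a namespace-wide default for ephemeral storage via a LimitRange,
# one way to work around missing per-pod settings. The namespace name
# "graphscope" and the sizes are placeholders.
from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() inside a pod

limit_range = client.V1LimitRange(
    metadata=client.V1ObjectMeta(name="default-ephemeral-storage"),
    spec=client.V1LimitRangeSpec(
        limits=[
            client.V1LimitRangeItem(
                type="Container",
                default={"ephemeral-storage": "8Gi"},          # default limit
                default_request={"ephemeral-storage": "2Gi"},  # default request
            )
        ]
    ),
)
client.CoreV1Api().create_namespaced_limit_range(
    namespace="graphscope", body=limit_range
)
```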


welcome[bot] commented 2 years ago

Thanks for opening your first issue here! Be sure to follow the issue template, and a maintainer will get back to you shortly. Please feel free to contact us on DingTalk, WeChat (account: graphscope), or Slack. We are happy to answer your questions promptly.

sighingnow commented 2 years ago

Hi @xarion,

Supporting more complete Kubernetes deployment options is indeed on our TODO list. I would like you to know that we already allow users to add custom volumes (see the k8s_volumes argument of Session.init) and to set the CPU/memory requests/limits (see graphscope.set_option).
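
To make those two knobs concrete, a minimal sketch; the exact k8s_volumes dict layout is an assumption here (consult the Session.init docstring), and all resource values are placeholders:

```python
import graphscope

# Illustrative k8s_volumes entry (a hostPath volume mounted into the engine
# pods); treat the exact dict schema as an assumption based on this thread.
volumes = {
    "scratch": {
        "type": "hostPath",
        "field": {"path": "/mnt/scratch", "type": "Directory"},
        "mounts": {"mountPath": "/tmp/scratch"},
    }
}

# Resource knobs named in this thread; values are placeholders.
graphscope.set_option(k8s_engine_cpu=4, k8s_engine_mem="8Gi")

sess = graphscope.session(k8s_volumes=volumes)
```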

We are planning to support a user-provided dict as an argument and to merge it into our builtin deployment settings before sending them to Kubernetes. Such a feature will hopefully be released by the end of July.
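
A hypothetical illustration of that planned merge behavior (the deep_merge helper below is not GraphScope code, and the settings are made-up examples):

```python
# Hypothetical sketch: a user-provided dict is deep-merged into the builtin
# deployment settings, with user values taking precedence on conflicts.
def deep_merge(builtin: dict, user: dict) -> dict:
    merged = dict(builtin)
    for key, value in user.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

builtin = {"resources": {"requests": {"cpu": "2"}}, "restartPolicy": "Always"}
user = {"resources": {"limits": {"cpu": "4"}}, "nodeSelector": {"disk": "ssd"}}
print(deep_merge(builtin, user))
# {'resources': {'requests': {'cpu': '2'}, 'limits': {'cpu': '4'}},
#  'restartPolicy': 'Always', 'nodeSelector': {'disk': 'ssd'}}
```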

Hope the information above is helpful for you folks.

xarion commented 2 years ago

Thanks for the response @sighingnow. Unfortunately, although graphscope.set_option allows us to set the CPU/memory limits, those values are not used when creating the Kubernetes configuration; only the requested CPU/memory values are reflected in the config.

Also, as you mentioned, k8s_volumes allows us to mount drives, but only in a limited way. We have been unsuccessful in our attempts to use this setting to increase the ephemeral storage. One way would be to attach a mount to the log folder, but unfortunately this is also not enough, because to my knowledge k8s_volumes are not attached to the Jupyter and coordinator pods. Even if it worked, this would be a complex way of dealing with the Kubernetes configuration.

I suggest that, instead of passing parameters in a dict, we use the Kubernetes way and define a YAML file.

xarion commented 2 years ago

Hey @sighingnow.

Unfortunately, although graphscope.set_option allows us to set the CPU/memory limits, those values are not used when creating the Kubernetes configuration; only the requested CPU/memory values are reflected in the config.

This is still causing a major headache for us.

sighingnow commented 2 years ago

Hi @xarion,

I think k8s_volumes should be enough to set the required storage, and k8s_engine_cpu, k8s_engine_mem, etc. should be enough to set the CPU/memory limits.

assigning pods to nodes using labels is not possible.

We could add options like k8s_engine_pod_label={}, k8s_vineyard_pod_label={} to support that. Do you think such options would be enough for your cases? If they could work, we can try to include the implementation in the upcoming v0.17.0 release.
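
If it helps, a hypothetical sketch of how those proposed options might be passed; neither option exists yet, and the label values are placeholders:

```python
import graphscope

# Hypothetical: k8s_engine_pod_label / k8s_vineyard_pod_label are only
# proposed in this thread, not a released API; label values are placeholders.
sess = graphscope.session(
    k8s_engine_pod_label={"app": "graphscope-engine"},
    k8s_vineyard_pod_label={"app": "vineyard"},
)
```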

sighingnow commented 2 years ago

use the Kubernetes way and define a YAML file.

That won't happen before v0.17.0; we need to investigate the schema of the YAML file to define which options are customizable and which are not.

xarion commented 2 years ago

Hi @sighingnow

I think k8s_volumes should be enough to set the required storage, and k8s_engine_cpu, k8s_engine_mem, etc. should be enough to set the CPU/memory limits.

Currently GraphScope sets the "requests" parameter in Kubernetes. This does not necessarily block the resources for the pod, so multiple tasks that request the maximum memory can be scheduled onto the same node. Unfortunately, we are running into this issue and have hacky workarounds to prevent it. To fix this correctly, the "limits" parameter should be set.

sighingnow commented 2 years ago

This does not necessarily block the resources for the pod.

The session (as well as graphscope.set_option) has a preemptive option (see https://github.com/alibaba/GraphScope/blob/main/python/graphscope/config.py#L86). When it is set to False, the arguments k8s_engine_cpu, k8s_engine_mem, etc. will be used as "requests" rather than "limits".
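
Concretely, under that reading this would look something like the following (the resource values are placeholders):

```python
import graphscope

# With preemptive=False (see the linked config.py), the cpu/mem arguments
# are applied as scheduler-visible "requests" rather than as "limits".
# The resource values below are placeholders.
graphscope.set_option(preemptive=False)
graphscope.set_option(k8s_engine_cpu=4, k8s_engine_mem="8Gi")

sess = graphscope.session()
```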

Currently GraphScope sets the "requests" parameter in Kubernetes. ..... To fix this correctly, the "limits" parameter should be set.

I guess you mean "requests" is what you want, but we currently set them as "limits".
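
For reference, the two fields sit side by side in a container spec; here is a minimal sketch using the official Kubernetes Python client (values are placeholders):

```python
from kubernetes import client

# In Kubernetes, "requests" is what the scheduler reserves on a node, while
# "limits" caps what the container may actually consume. Setting both to the
# same value keeps co-scheduled pods from overcommitting the node.
resources = client.V1ResourceRequirements(
    requests={"cpu": "4", "memory": "8Gi"},
    limits={"cpu": "4", "memory": "8Gi"},
)
```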

xarion commented 2 years ago

You're right, I confused limits and requests in my post. Setting preemptive=False resolved our headache, and now we can schedule pods properly. In the future, deploying to our existing systems will require a more elaborate Kubernetes configuration, so we are looking forward to the release of v0.17.0!

sighingnow commented 2 years ago

For the future, deploying to our existing systems will require a more elaborate kubernetes configuration.

FYI: support for node selectors has been added in https://github.com/alibaba/GraphScope/pull/2087
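
The exact keyword argument is defined in that PR; a hypothetical usage could look like:

```python
import graphscope

# Hypothetical: the actual argument name added by PR #2087 may differ;
# consult the PR for the released spelling. The label is a placeholder.
sess = graphscope.session(
    k8s_engine_pod_node_selector={"disktype": "ssd"},
)
```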

sighingnow commented 2 years ago

Closing, as the problems raised in this issue have been resolved. Feel free to open new tickets if you have other feature requests.