kubeflow / common

Common APIs and libraries shared by other Kubeflow operator repositories.
Apache License 2.0
51 stars 73 forks source link

Make training container port support multi ports #122

Closed xieydd closed 3 years ago

xieydd commented 3 years ago

Now common repo only supports single port for communicating with each other, but some scenarios need another pod for profiling or other user cases. Currently GetPortFromJob func only return single port for job. https://github.com/kubeflow/common/blob/1c55acc7fa15c38ad47a80fb8c97d8d63f9a8748/pkg/controller.v1/common/service.go#L256

Making port flexible, user can configure multi ports with the name and port for different role. /assign @xieydd /cc @gaocegege @terrytangyuan

terrytangyuan commented 3 years ago

Seems like a valid use case. WDYT? @gaocegege @Jeffwan

terrytangyuan commented 3 years ago

@xieydd Where will this be used? One of the training operators?

xieydd commented 3 years ago

Het, terry.One user case from here https://github.com/kubeflow/tf-operator/issues/1251#issuecomment-805448262.

Jeffwan commented 3 years ago

Em. Is this only for profiling? Any other use case needs a separate port? This should be a valid use case. If that's only for profiling, I am trying to see if we can abstract a generic way to provide the support or let users to determine the port used for profiling.

xieydd commented 3 years ago

@Jeffwan AFAIK, profiling is the only user case. Generic a profiling port for user is good, but i am a little confuse why we can not make it flexible?

Jeffwan commented 3 years ago

@xieydd Yeah, I just raise the idea. Never mind. Do you want to give it a try to support multiple ports?

xieydd commented 3 years ago

Yes, i will give a pr asap. Thanks for @Jeffwan and @terrytangyuan