tensorflow / tfx

TFX is an end-to-end platform for deploying production ML pipelines
https://tensorflow.org/tfx
Apache License 2.0

[Help needed] Pass config to custom executor operator #5054

Open jccarles opened 1 year ago

jccarles commented 1 year ago

Hello! I am trying to migrate from a custom launcher to a custom executor operator, as shown in "Migrate from custom Launcher to custom ExecutorOperator", but I have trouble understanding how to replicate the behavior of the component_config argument of the previous Launcher class, which let us pass some very useful configuration. In the README example, the DockerExecutorOperator initializes a docker_config, but as far as I can tell there is no way to pass an argument through to it.

Any help would be appreciated; this subject is quite complex and I might have missed something. To give a bit more context, the goal of the custom executor operator is to spawn a compute cluster for the Beam component computations and then tear it down, in order to better manage resources. The new CustomExecutorOperator therefore inherits from the BeamExecutorOperator.

Thank you in advance!
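To make the intent concrete, here is a minimal sketch of the spawn-then-tear-down pattern. It is not TFX code: StandInBeamOperator is a hypothetical stand-in for BeamExecutorOperator, the run_executor signature is simplified, and the "workers" key and the _start_cluster/_stop_cluster helpers are made up for illustration. The real signatures should be checked against the TFX source.

```python
from typing import Any, Dict, List, Optional


class StandInBeamOperator:
    """Hypothetical stand-in for a Beam-based executor operator (not the TFX class)."""

    def __init__(self, executor_spec: Any, platform_config: Optional[Dict[str, Any]] = None):
        self.executor_spec = executor_spec
        self.platform_config = platform_config

    def run_executor(self, execution_info: Dict[str, Any]) -> str:
        # The real operator would run the component's executor here.
        return f"ran {execution_info['component']}"


class ClusterBackedOperator(StandInBeamOperator):
    """Starts a cluster, delegates to the parent, then always tears the cluster down."""

    def __init__(self, executor_spec: Any, platform_config: Optional[Dict[str, Any]] = None):
        super().__init__(executor_spec, platform_config)
        self.events: List[str] = []  # records lifecycle steps, for illustration only

    def _start_cluster(self) -> None:
        # Assumption: the cluster size arrives through platform_config.
        workers = (self.platform_config or {}).get("workers", 1)
        self.events.append(f"start:{workers}")

    def _stop_cluster(self) -> None:
        self.events.append("stop")

    def run_executor(self, execution_info: Dict[str, Any]) -> str:
        self._start_cluster()
        try:
            return super().run_executor(execution_info)
        finally:
            # Runs even if the executor raises, so the cluster is not leaked.
            self._stop_cluster()


operator = ClusterBackedOperator(executor_spec=None, platform_config={"workers": 4})
result = operator.run_executor({"component": "Transform"})
```

The try/finally is the important part: the teardown happens whether or not the Beam computation succeeds.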

1025KB commented 1 year ago

What information are you trying to pass through component_config?

jccarles commented 1 year ago

I want to pass some strings that help define the specifications of the distributed computing cluster I want to spawn. Do you think the type of information is relevant?

jccarles commented 1 year ago

Also, to be more precise, I would like to use my custom executor operator with the KubeflowDagRunner. This raises an additional issue: how to make the containerEntrypoint aware of my new executor operator, since the custom_executor_operators parameter is instantiated inside the entrypoint's main. The migration documentation advises the following:

a. All first-party ExecutorOperators developed by TFX team should already be in the DEFAULT_EXECUTOR_OPERATORS dictionary in the Launcher module.

b. For custom ExecutorOperator not in above dict, please inject it into the launcher constructor via the custom_executor_operators argument.

But I don't see how I am supposed to either update the DEFAULT_EXECUTOR_OPERATORS dictionary or inject my operator through the custom_executor_operators argument.

jccarles commented 1 year ago

I think in the end my issue is with container_entrypoint.py for Kubeflow-orchestrated pipelines, as it does not expose the parameters platform_config and custom_executor_operators.

The platform_config is neither exposed nor passed to the Launcher, even though the Launcher supports it. The custom_executor_operators parameter is not exposed either: it is defined inside main and then passed to the Launcher at instantiation. This prevents users from passing a custom executor operator they have defined. The best place to pass configuration for such an operator appears to be the platform_config, which is not passed to the Launcher in this case.

Is this the right place to discuss this? Should I open an issue or feature request regarding the Kubeflow container entrypoint file?
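A minimal sketch of the behavior being asked for, assuming the launcher picks an operator class by the type of the executor spec and merges user-supplied operators over the defaults. Every name here (StandInLauncher, DefaultSpec, CustomSpec, DEFAULT_OPERATORS, the "cluster_name" key) is a hypothetical stand-in, not the TFX API or the actual container_entrypoint.py.

```python
from typing import Any, Dict, Optional, Type


class DefaultSpec:
    """Stand-in for a first-party executor spec type."""


class CustomSpec:
    """Stand-in for a user-defined executor spec type."""


class DefaultOperator:
    def __init__(self, executor_spec: Any, platform_config: Optional[Any] = None):
        self.executor_spec = executor_spec
        self.platform_config = platform_config


class CustomOperator(DefaultOperator):
    """User-defined operator that wants to read platform_config."""


# Stand-in for the built-in mapping from spec type to operator class.
DEFAULT_OPERATORS: Dict[type, Type[DefaultOperator]] = {DefaultSpec: DefaultOperator}


class StandInLauncher:
    def __init__(
        self,
        executor_spec: Any,
        platform_config: Optional[Any] = None,
        custom_executor_operators: Optional[Dict[type, Type[DefaultOperator]]] = None,
    ):
        # User-supplied operators extend (and may override) the defaults.
        operators = dict(DEFAULT_OPERATORS)
        operators.update(custom_executor_operators or {})
        operator_cls = operators[type(executor_spec)]
        self.operator = operator_cls(executor_spec, platform_config)


def main(
    executor_spec: Any,
    platform_config: Optional[Any] = None,
    custom_executor_operators: Optional[Dict[type, Type[DefaultOperator]]] = None,
) -> StandInLauncher:
    # An entrypoint that forwards both parameters instead of fixing them internally.
    return StandInLauncher(executor_spec, platform_config, custom_executor_operators)


launcher = main(
    CustomSpec(),
    platform_config={"cluster_name": "demo"},
    custom_executor_operators={CustomSpec: CustomOperator},
)
```

With an entrypoint shaped like this, the custom operator is selected for its spec type and receives the platform_config untouched.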

Thank you in advance for your time.