luischinchillagarcia closed this issue 4 years ago
Hi @luischinchillagarcia, to answer your first question: many stock executors are implemented as Beam pipelines, for example CsvExampleGen, and `beam_pipeline_args` is for those. `beam_orchestrator_args` is for the `BeamDagRunner`, which orchestrates tasks whose executors can be arbitrary, so the two are different.
In short, `beam_orchestrator_args` is for `BeamDagRunner`, and `beam_pipeline_args` is for executors that use a Beam pipeline to do the job.
@luischinchillagarcia I hope @numerology answered your question pretty well. Can I close this issue?
@numerology Thank you for your response. It definitely clears up the difference between `beam_orchestrator_args` and `beam_pipeline_args`.
However, @gowthamkpr, I’m still unsure about the relationship between `additional_pipeline_args` and the latter two arguments.
Secondly, does this mean that, assuming we are using the stock executors, all option arguments should be the exact same ones as beam arguments? In other words, both would have the exact same list of arguments that just work to specify for the orchestration or the individual components?
> relationship between `additional_pipeline_args` and the latter two arguments.

`beam_pipeline_args` is a field in `additional_pipeline_args`. The latter is a nested dictionary. These two args are 'orthogonal' to `beam_orchestrator_args`. `additional_pipeline_args` can also include other things, such as the workflow ID when running on KFP.
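A rough sketch of how the three pieces relate, assuming the nested-dictionary layout described in this reply. It is plain Python with no TFX import; the `workflow_id` key name and all flag values are illustrative placeholders, not confirmed TFX names.

```python
# Flags for the Beam job that BeamDagRunner itself uses to orchestrate the DAG.
beam_orchestrator_args = ["--runner=DirectRunner"]

# Flags forwarded to executors that run their own Beam pipeline
# (for example CsvExampleGen).
beam_pipeline_args = ["--runner=DirectRunner", "--direct_num_workers=1"]

# Nested dictionary that carries beam_pipeline_args alongside other settings.
additional_pipeline_args = {
    "beam_pipeline_args": beam_pipeline_args,
    "workflow_id": "my-workflow",  # hypothetical key name, for illustration only
}

# The orchestrator flags live outside this dictionary entirely.
print("beam_orchestrator_args" in additional_pipeline_args)
print(additional_pipeline_args["beam_pipeline_args"])
```

The point is only that the executor flags travel inside the dictionary, while the orchestrator flags are handed to the runner separately.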
> In other words, both would have the exact same list of arguments that just work to specify for the orchestration or the individual components?

They are all Beam arguments. One thing worth mentioning here is that the validity of arguments can only be judged for the whole set together, not individually. For example, when we have `runner=DataflowRunner` we can specify `project`, `region`, etc., which make no sense when running locally.
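To illustrate the "considered all together" point, here is a minimal sketch of a check that looks at the whole flag list at once. The helper names and the `DATAFLOW_ONLY` rule set are hypothetical and written just for this example; this is not how Beam itself validates options.

```python
def parse_flags(args):
    """Turn ['--key=value', ...] into {'key': 'value', ...}."""
    flags = {}
    for arg in args:
        key, _, value = arg.lstrip("-").partition("=")
        flags[key] = value
    return flags


# Hypothetical rule set: flags that only make sense with DataflowRunner.
DATAFLOW_ONLY = {"project", "region"}


def misplaced_flags(args):
    """Return Dataflow-only flags that appear without runner=DataflowRunner."""
    flags = parse_flags(args)
    if flags.get("runner") == "DataflowRunner":
        return []
    return sorted(DATAFLOW_ONLY.intersection(flags))


local_args = ["--runner=DirectRunner", "--project=my-project", "--region=us-central1"]
cloud_args = ["--runner=DataflowRunner", "--project=my-project", "--region=us-central1"]

print(misplaced_flags(local_args))  # ['project', 'region']
print(misplaced_flags(cloud_args))  # []
```

The same `--project` flag is fine in one list and meaningless in the other, so it cannot be judged in isolation from `--runner`.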
Perfect. Thank you for your response!
BeamDagRunner mentions the following,
It's clear that beam_orchestrator_args are Beam arguments; however, it is less clear what arguments beam_pipeline_args and additional_pipeline_args take (and why they are different).
Would it be possible to get the list of arguments for beam_orchestrator_args, beam_pipeline_args, and additional_pipeline_args? In addition, could the documentation give a clearer description to remove the ambiguity between them?