apache / beam

Apache Beam is a unified programming model for Batch and Streaming data processing.
https://beam.apache.org/
Apache License 2.0

Populate labels and custom metadata on VMs that serve Dataflow #19453

Open kennknowles opened 2 years ago

kennknowles commented 2 years ago

Currently, Apache Beam on Google Cloud Dataflow provides no way to set custom labels and metadata on the VM instances that serve a Dataflow job. Only labels on the job itself are available.

Actually, com.google.api.services.dataflow.model.WorkerPool already has a metadata field, but its setMetadata method is never called.

We need to add a way to provide custom labels and metadata for VM instances when running a Dataflow job on Google Cloud.

Imported from Jira BEAM-6832. Original Jira may contain additional context. Reported by: alex3.14.

prasrvenkat commented 2 years ago

@kennknowles @damccorm Would you know if this is being worked on? If not can I take it? Let me know.

kennknowles commented 2 years ago

You can use DataflowPipelineOptions.setLabels today.
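
As a rough illustration of that suggestion, here is a minimal sketch of preparing a label map to hand to DataflowPipelineOptions.setLabels. The class name DataflowLabels and its helpers build and toFlag are hypothetical, and the validation patterns are an assumed, conservative subset of the Google Cloud label rules. The Beam calls appear only in comments because they need the Beam Dataflow runner on the classpath. The JSON form of the --labels flag is also an assumption about how Beam parses map-valued options.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

class DataflowLabels {
    // Assumption: keys start with a lowercase letter; keys and values use only
    // lowercase letters, digits, underscores, and dashes, up to 63 characters.
    private static final Pattern KEY = Pattern.compile("[a-z][a-z0-9_-]{0,62}");
    private static final Pattern VALUE = Pattern.compile("[a-z0-9_-]{0,63}");

    // Builds an insertion-ordered label map from alternating key/value arguments.
    static Map<String, String> build(String... keyValuePairs) {
        if (keyValuePairs.length % 2 != 0) {
            throw new IllegalArgumentException("Expected alternating keys and values");
        }
        Map<String, String> labels = new LinkedHashMap<>();
        for (int i = 0; i < keyValuePairs.length; i += 2) {
            String key = keyValuePairs[i];
            String value = keyValuePairs[i + 1];
            if (!KEY.matcher(key).matches()) {
                throw new IllegalArgumentException("Invalid label key: " + key);
            }
            if (!VALUE.matcher(value).matches()) {
                throw new IllegalArgumentException("Invalid label value: " + value);
            }
            labels.put(key, value);
        }
        return labels;
    }

    // Renders the map as a JSON-valued --labels argument. No escaping is needed
    // because build() has already rejected quotes and backslashes.
    static String toFlag(Map<String, String> labels) {
        StringBuilder json = new StringBuilder("{");
        for (Map.Entry<String, String> e : labels.entrySet()) {
            if (json.length() > 1) {
                json.append(',');
            }
            json.append('"').append(e.getKey()).append("\":\"")
                .append(e.getValue()).append('"');
        }
        return "--labels=" + json.append('}');
    }

    public static void main(String[] args) {
        Map<String, String> labels = build("team", "data-eng", "env", "staging");
        // prints --labels={"team":"data-eng","env":"staging"}
        System.out.println(toFlag(labels));

        // With the Beam Dataflow runner dependency available, the same map
        // could be set programmatically:
        //   DataflowPipelineOptions options =
        //       PipelineOptionsFactory.fromArgs(args).as(DataflowPipelineOptions.class);
        //   options.setLabels(labels);
    }
}
```

This covers job labels only. Custom VM metadata, the other half of this issue, is not addressed by setLabels.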

prasrvenkat commented 2 years ago

> You can use DataflowPipelineOptions.setLabels today.

Ok awesome thank you

tahsib-optimizely commented 2 months ago

How can I add metadata?