vectordotdev / vector

A high-performance observability data pipeline.
https://vector.dev
Mozilla Public License 2.0

Support PubSub attributes for `gcp_pubsub` sink #17265

Open galah92 opened 1 year ago

galah92 commented 1 year ago

Use Cases

I'd like to be able to specify GCP PubSub message attributes, dynamically, when using the gcp_pubsub sink.

My use case is forwarding messages from RabbitMQ to GCP PubSub, and I'd like to forward the RabbitMQ message's routing_key as a GCP PubSub message attribute. If I try the following remap transform (VRL):

[transforms.my_transformation]
type = "remap"
inputs = ["my_input"]
source = '''
    .attributes.deviceId = split(string!(.routing), ".")[1]
    .data = .message
    del(.exchange)
    del(.message)
    del(.offset)
    del(.routing)
    del(.source_type)
    del(.timestamp)
'''

This produces an event shaped like a PubsubMessage, but the whole event is encoded into the data field of the PubsubMessage instead of being used as the actual PubsubMessage: https://github.com/vectordotdev/vector/blob/7570bb31e2f471e3ff8bc818c24e9bde3090818c/src/sinks/gcp/pubsub.rs#L192-L202
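To make the mismatch concrete, here is a minimal Python sketch of the two encodings. It is not Vector's code: the function names are hypothetical, and it assumes only that a published Pub/Sub message carries a base64-encoded `data` field plus an optional `attributes` map of strings, as in the Pub/Sub REST API.

```python
import base64
import json


def encode_whole_event(event: dict) -> dict:
    """Behaviour described above: the entire event becomes the `data` payload."""
    payload = json.dumps(event).encode("utf-8")
    return {"data": base64.b64encode(payload).decode("ascii")}


def encode_with_attributes(event: dict) -> dict:
    """Requested behaviour (hypothetical): `.attributes` is lifted out of the payload."""
    # Pub/Sub attributes are string-to-string, so coerce keys and values.
    attributes = {str(k): str(v) for k, v in event.get("attributes", {}).items()}
    payload = str(event.get("data", "")).encode("utf-8")
    return {
        "data": base64.b64encode(payload).decode("ascii"),
        "attributes": attributes,
    }


# Event as it would look after the remap transform above.
event = {"attributes": {"deviceId": "device1"}, "data": "hello"}
current = encode_whole_event(event)      # attributes end up inside `data`
wanted = encode_with_attributes(event)   # attributes sit next to `data`
```

In the first form, a subscriber has to decode `data` to find `deviceId`. In the second, `deviceId` is available as a message attribute.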

Attempted Solutions

No response

Proposal

I would love to contribute this change myself, but I'm not sure how to approach it without introducing breaking changes.

References

No response

Version

vector 0.28.0

spencergilbert commented 1 year ago

Yep, we're just encoding the entire event into the data field of that PubsubMessage. I'm not overly familiar with the service; do you have any other examples of what could get put into the attributes? Or is it just arbitrary metadata so "it depends"?

It does feel like a reasonable addition to make to the sink 👍

galah92 commented 1 year ago

> I'm not overly familiar with the service, do you have any other examples of what could get put into the attributes? Or is it just arbitrary metadata so "it depends"?

Arbitrary metadata: attributes are simple key-value string pairs. These can be used for efficient routing in the PubSub ecosystem.

In my specific example, I'd like to forward part of the AMQP message routing key as an attribute, so for a routing key devices.device1.state I would extract device1 and set an attribute { "deviceId": "device1" }.
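The extraction described here can be sketched in Python as follows. The helper name is hypothetical, and raising on a malformed key is an assumption of this sketch, not something taken from the VRL above.

```python
def device_id_from_routing_key(routing_key: str) -> str:
    """Return the second dot-separated segment of an AMQP routing key."""
    parts = routing_key.split(".")
    if len(parts) < 2 or not parts[1]:
        # Assumption: treat a key without a second segment as an error.
        raise ValueError(f"routing key has no device segment: {routing_key!r}")
    return parts[1]


# Build the attribute map for the example routing key from the comment above.
attributes = {"deviceId": device_id_from_routing_key("devices.device1.state")}
```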

spencergilbert commented 1 year ago

Yep, sounds reasonable to me. I imagine it could work similarly to the loki sink's labels configuration.
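As a rough illustration of what a labels-style configuration could mean for attributes, the Python sketch below renders a map of attribute templates against an event. Everything in it is an assumption made for illustration: the `{{ field }}` placeholder syntax, the function names, and the choice to drop an attribute whose field is missing. None of it describes an existing `gcp_pubsub` option.

```python
import re

# Matches placeholders such as "{{ device_id }}" or "{{ meta.device }}".
_TEMPLATE = re.compile(r"\{\{\s*([A-Za-z0-9_.]+)\s*\}\}")


def lookup(event: dict, path: str):
    """Follow a dotted path through nested dicts; return None if absent."""
    current = event
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def render_attributes(templates: dict, event: dict) -> dict:
    """Render each attribute template; skip attributes with missing fields."""
    rendered = {}
    for key, template in templates.items():
        missing = False

        def substitute(match):
            nonlocal missing
            value = lookup(event, match.group(1))
            if value is None:
                missing = True
                return ""
            return str(value)

        value = _TEMPLATE.sub(substitute, template)
        if not missing:
            rendered[key] = value
    return rendered


# Hypothetical attribute templates, one dynamic and one static.
templates = {"deviceId": "{{ device_id }}", "source": "rabbitmq"}
result = render_attributes(templates, {"device_id": "device1", "message": "hi"})
```

With this approach the event body stays untouched and attributes are derived from it by configuration, which would avoid changing how existing events are encoded.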

sergialonsaco commented 6 months ago

Is there any news regarding this? It's an enhancement I would like to have, too. What's the best approach to get it done? Thx!