open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0

Create otelarrow exporter instance per distinct metadata combos #34178

Open kristinapathak opened 2 months ago

kristinapathak commented 2 months ago

Component(s)

exporter/otelarrow

Is your feature request related to a problem? Please describe.

I want to open Arrow streams such that every message sent on a given stream carries the same values for a configured set of metadata keys, similar to configuring metadata_keys in the batch processor.

Describe the solution you'd like

Similar to the batch processor, add the below configuration fields:

metadata_keys (default = empty): When set, this exporter will create one arrow exporter instance per distinct combination of values in the client.Metadata.
metadata_cardinality_limit (default = 1000): When metadata_keys is not empty, this setting limits the number of unique combinations of metadata key values that will be processed over the lifetime of the exporter.

Then, for each unique combination of metadata key values, an arrow exporter object is created. Each object creates the number of streams specified by num_streams, so the maximum possible number of open streams would be metadata_cardinality_limit * num_streams.

Note: there is nothing arrow exporter specific about this logic. It could be used for any exporter where unique objects are desired per metadata combination.

Describe alternatives you've considered

No response

Additional context

No response

github-actions[bot] commented 2 months ago

Pinging code owners:

jmacd commented 2 months ago

@kristinapathak Thanks for filing! I agree with the stated goal here. While the Arrow components can pass headers on a per-request level, this can present authorization challenges for vendors that would prefer to authorize based on HTTP-level headers.

As you wrote, this sounds like functionality that could be implemented in an exporterhelper module, but there may not be many components that want this. The only reservation I have is that this method will tie us to the legacy style of batching in processor components and will restrict us from using the new exporterhelper-based batching. Ideally, I think we could look for ways to improve exporterhelper to support both batching and exporting by metadata, but that can be a future project. I support adding this functionality directly in the Arrow exporter, especially if the logic can be kept mostly separate, which seems natural.

github-actions[bot] commented 2 weeks ago

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

kristinapathak commented 18 hours ago

@jmacd, I think this was fixed in https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/34827? Can I close this?