Abstract
The goal of this RFC is to standardize the name and the content of the azure-eventhub field.
Introduction
Since its inception five years ago, the azure-eventhub input has stored the "event hub metadata" (event hub name, consumer group, offset, and more since input v2) in the azure field, which is of type object.
However, many integrations use the azure field as the root element for their own fields (e.g. azure.activitylogs). These integrations usually rename the azure field holding the metadata to azure-eventhub, so that the metadata is kept alongside the actual data.
The older integrations perform the rename azure > azure-eventhub, but the more recent integrations do not.
There are at least two practical problems here:
- The input stores the metadata in a field that most integrations rename as the first step of their default pipeline.
- Recent integrations do not rename the field, creating inconsistencies and potential mapping conflicts.
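To make the rename concrete, here is a minimal Python sketch of what the older integrations effectively do to each event. It is an illustration only, not the integrations' actual pipeline code: the function name `rename_metadata`, the sample values, and the `azure.activitylogs.category` field are assumptions made up for this example.

```python
import copy


def rename_metadata(event: dict) -> dict:
    """Move the input's metadata from `azure` to `azure-eventhub`.

    Hypothetical helper: it mimics the effect of the rename step that
    older integrations run first in their default pipeline.
    """
    renamed = copy.deepcopy(event)
    renamed["azure-eventhub"] = renamed.pop("azure")
    return renamed


# Event as emitted by the input today: metadata sits under `azure`.
# All values below are invented for illustration.
raw_event = {
    "message": '{"records": []}',
    "azure": {
        "eventhub": "my-hub",
        "consumer_group": "$Default",
        "offset": 1024,
        "sequence_number": 42,
    },
}

event = rename_metadata(raw_event)

# With `azure` free again, the integration can use it as the root
# for its own fields without colliding with the metadata.
event["azure"] = {"activitylogs": {"category": "Administrative"}}
```

An integration that skips the rename ends up with both the metadata and its own fields under the same azure root, which is where the inconsistencies and conflicts come from.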
Proposal
I suggest:
- Adopting the current de facto standard name azure-eventhub as the official metadata field name.
- Documenting all the existing field content.
- Changing the input to store the metadata in the azure-eventhub field.
- Changing the input to make the azure-eventhub field optional, so that users can save storage if required (enabled by default).
- Making sure all existing integrations work with the azure-eventhub field.
Existing field content
The metadata field contains the following information.
| Field | Description | Notes |
|---|---|---|
| azure-eventhub.eventhub | Event hub name | |
| azure-eventhub.consumer_group | Name of the consumer group | |
| azure-eventhub.enqueued_time | Timestamp of the time the message was published on the event hub | |
| azure-eventhub.offset | Message offset in the event hub partition | |
| azure-eventhub.sequence_number | Message sequence number in the event hub partition | |
| azure-eventhub.partition_id | The partition ID of the message | since v2 |
| azure-eventhub.partition_key | The partition key of the message | since v2 (optional) |
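The field list above, together with the proposed "optional, enabled by default" behavior, can be sketched in Python as follows. This is a rough model under stated assumptions, not the input's implementation: the Python types, the `EventHubMetadata` and `build_event` names, and the `include_metadata` flag are all hypothetical and chosen only for illustration.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class EventHubMetadata:
    """Metadata fields from the table; the types are assumptions."""
    eventhub: str
    consumer_group: str
    enqueued_time: str                   # timestamp, shown here as ISO 8601 text
    offset: int
    sequence_number: int
    partition_id: Optional[str] = None   # since v2
    partition_key: Optional[str] = None  # since v2 (optional)


def build_event(message: str, meta: EventHubMetadata,
                include_metadata: bool = True) -> dict:
    """Build an event, attaching metadata under `azure-eventhub`.

    `include_metadata` is a hypothetical switch modelling the proposed
    "optional, default enabled" behavior.
    """
    event = {"message": message}
    if include_metadata:
        # Leave out fields that are not set (e.g. a missing partition key).
        event["azure-eventhub"] = {
            key: value for key, value in asdict(meta).items()
            if value is not None
        }
    return event
```

For example, `build_event("...", meta)` yields an event with an azure-eventhub object, while `build_event("...", meta, include_metadata=False)` yields only the message.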
Rationale
Name
- It is already used by the majority of integrations.
- It is backward compatible.
- Since it is the same name as the input, the risk of conflicts is probably low.
If I could go back to the time when the input was created, with today's experience I would call this field something like azure_eventhub_metadata. However, azure-eventhub is good enough to convey the semantics.
Changing the field name would be a breaking change that doesn't feel worth it, given the secondary role of the metadata field from the users' perspective.
Impact
Since all integrations will use the azure-eventhub field, we expect fewer mapping conflicts on the azure field.
Security Considerations
No security implications so far.
Backward Compatibility
We need to double-check that the rename processor in the existing integrations works correctly when there is no azure field in the message, which will be the case once the input writes the metadata directly to azure-eventhub.
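The case to verify can be modelled with a small Python sketch. It is a simplified stand-in for a rename step, not Elasticsearch code; the `ignore_missing` parameter is modelled on the option of the same name on the Elasticsearch rename processor, and whether each existing integration sets it is exactly what needs checking.

```python
def rename(doc: dict, field: str, target_field: str,
           ignore_missing: bool = False) -> dict:
    """Simplified model of a rename step applied to one document."""
    if field not in doc:
        if ignore_missing:
            # Source field absent: leave the document untouched.
            return doc
        raise KeyError(f"field [{field}] doesn't exist")
    if target_field in doc:
        raise ValueError(f"field [{target_field}] already exists")
    doc[target_field] = doc.pop(field)
    return doc


# Document produced by an updated input: metadata is already under
# `azure-eventhub` and there is no `azure` field (values are made up).
new_style = {"message": "m", "azure-eventhub": {"eventhub": "my-hub"}}

# Tolerant rename: the document passes through unchanged.
unchanged = rename(dict(new_style), "azure", "azure-eventhub",
                   ignore_missing=True)

# Strict rename: the same document makes the step fail.
try:
    rename(dict(new_style), "azure", "azure-eventhub")
    strict_failed = False
except KeyError:
    strict_failed = True
```

In this model, integrations whose rename step tolerates a missing source field keep working after the input change, while strict ones would start failing and need an update first.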
Implementation
### Tasks
- [ ] Update the input to store metadata in the `azure-eventhub` field
- [ ] Make the `azure-eventhub` field optional
- [ ] Add a rename processor to integrations not yet using the `azure-eventhub` field
- [ ] Write a .md document or section that documents the existing metadata content
Conclusion
Key Points Summary
Proposal Purpose: Standardize the azure-eventhub field name and content across integrations for consistency.
Background: Historical inconsistencies arose as the azure field was renamed to azure-eventhub in various implementations, causing confusion.
Current Issues: Varied naming has led to difficulties in field mappings and increased conflict risks among older and newer integrations.
Proposed Changes: Adoption of azure-eventhub as the official field name, documentation of existing field content, making the field optional, ensuring backward compatibility.
Expected Impact: Reducing mapping conflicts and enhancing harmony across diverse integrations through standardization.
Implementation Steps: Clear plan for execution, including updates to input settings, adding rename processors, and documenting existing metadata.
Importance of the Proposal
Ensures consistency and clarity in handling Azure Event Hub metadata across integrations.
Addresses ongoing conflicts, improving ease of integration across the ecosystem.