vectordotdev / vector

A high-performance observability data pipeline.
https://vector.dev
Mozilla Public License 2.0

New `azure_event_hubs` sink #2434

Open binarylogic opened 4 years ago

binarylogic commented 4 years ago

Azure Event Hubs is like a managed Kafka service:

Allow existing Apache Kafka clients and applications to talk to Event Hubs without any code changes—you get a managed Kafka experience without having to manage your own clusters.

This suggests that we could wrap the existing kafka sink. I'd prefer that we wrap it, since it is technically a different service. We also get the marketing benefit (guides, etc.) when we create a new sink.

bigredmachine commented 4 years ago

I'm interested in Azure Event Hubs Sink Support.

More specifically, from an end-to-end perspective, I'm looking to scrape an application's Prometheus metrics endpoint, convert the metrics into Application Insights JSON metrics format (something like this: http://apmtips.com/blog/2017/10/27/send-metric-to-application-insights/), and then send them in compressed batches to an Azure Event Hub (preferred) or directly to Application Insights.

Appears the first part is supported.

Let me know if I can help, e.g. QA this out.

martinohansen commented 4 years ago

I can't get it working, so just to be sure: is it possible to use the Kafka sink for Azure Event Hub? And if so, are there any caveats I need to be aware of?

jamtur01 commented 4 years ago

@martinohansen when we say wrap the kafka sink, we mean adding a sink that reuses the kafka code, customized for Event Hub. I suspect the straight kafka source will not work directly.

mdtro commented 3 years ago

For those who are also super excited to get Vector to send to Azure Event Hub (Standard SKU required) as a sink: I was able to get this working this morning, using @joshmue's source configuration (see #4261) and the Microsoft documentation as inspiration. :)

librdkafka defaults the security.protocol setting to plaintext (see the librdkafka docs), but the Azure Event Hub service needs this value to be set to sasl_ssl.

From the Microsoft documentation:

bootstrap.servers=NAMESPACENAME.servicebus.windows.net:9093
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="$ConnectionString" password="{YOUR.EVENTHUBS.CONNECTION.STRING}";
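As a rough illustration of how those Java-client properties translate to librdkafka-style settings (which is what the Vector config below ends up expressing), here is a minimal Python sketch. The function name `event_hubs_kafka_settings` is hypothetical and only builds a dictionary; it does not connect to anything.

```python
def event_hubs_kafka_settings(namespace: str, connection_string: str) -> dict:
    """Build librdkafka-style settings for the Event Hubs Kafka endpoint.

    Hypothetical helper for illustration only; it mirrors the properties
    quoted from the Microsoft documentation above.
    """
    return {
        # The Kafka endpoint of an Event Hubs namespace listens on port 9093.
        "bootstrap.servers": f"{namespace}.servicebus.windows.net:9093",
        # librdkafka defaults to plaintext, so this has to be set explicitly.
        "security.protocol": "sasl_ssl",
        "sasl.mechanism": "PLAIN",
        # The user name is the literal string "$ConnectionString",
        # not a variable to be substituted.
        "sasl.username": "$ConnectionString",
        # The password is the full SAS connection string.
        "sasl.password": connection_string,
    }


if __name__ == "__main__":
    settings = event_hubs_kafka_settings("myhub", "<YOUR_SAS_CONNECTION_STRING>")
    for key, value in settings.items():
        print(f"{key}={value}")
```

The JAAS line in the Java properties carries the user name and password together; with librdkafka they are separate `sasl.username` and `sasl.password` settings.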

My specific use case below uses a stdout source pulling Kubernetes logs, but you can adjust the sources to match your needs.

sinks:
  event_hub:
    type: kafka
    inputs: ["kubernetes_logs"]
    bootstrap_servers: "<EVENT_HUB_NAMESPACE>.servicebus.windows.net:9093"
    group_id: '$$Default'
    topic: "<EVENT_HUB_NAME>"
    encoding:
      codec: "json"
    healthcheck:
      enabled: true
    sasl:
      enabled: true
      mechanism: PLAIN
      username: "$$ConnectionString"
      password: "<YOUR_SAS_CONNECTION_STRING>"
    librdkafka_options:
      "security.protocol": sasl_ssl

The key part that got it working for me is those last two lines. Make sure the key is surrounded with quotes, as it is expecting a string, not a map.

    librdkafka_options:
      "security.protocol": sasl_ssl

All that said, I'd love to give this a shot and implement this as a specific azure_event_hub sink wrapping the kafka sink. I'd need some mentorship on it, but I'm willing to put in the effort. :)

fpytloun commented 2 years ago

Here's a functional setup for both sink and source:

## Kafka sink
[sinks.out_azure_events]
  type = "kafka"
  inputs = ["parse_log_json"]
  encoding.codec = "json"
  bootstrap_servers = "myhub.servicebus.windows.net:9093"  # [hub_namespace].servicebus.windows.net:9093
  topic = "vector"  # event hub name
  compression = "zstd"

  librdkafka_options."security.protocol" = "sasl_ssl"
  [sinks.out_azure_events.sasl]
    enabled = true
    mechanism = "PLAIN"
    username = "$$ConnectionString"
    password = "Endpoint=sb://myhub.servicebus.windows.net/;SharedAccessKeyName=vector-producer;SharedAccessKey=dummy;EntityPath=vector" # SAS connection string
## Kafka source
[sources.in_azure_events]
  type = "kafka"
  bootstrap_servers = "myhub.servicebus.windows.net:9093"  # [hub_namespace].servicebus.windows.net:9093
  topics = ["vector"]  # event hub name
  group_id = '$$Default'

  librdkafka_options."security.protocol" = "sasl_ssl"
  [sources.in_azure_events.sasl]
    enabled = true
    mechanism = "PLAIN"
    username = "$$ConnectionString"
    password = "Endpoint=sb://myhub.servicebus.windows.net/;SharedAccessKeyName=vector-producer;SharedAccessKey=dummy;EntityPath=vector" # SAS connection string

[transforms.azure_events_parse]
  type = "remap"
  inputs = ["in_azure_events"]
  source = '''
  . = parse_json!(string!(.message))
  '''

[sinks.console]
type = "console"
inputs = [ "azure_events_parse" ]
target = "stdout"
  [sinks.console.encoding]
  codec = "json"
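
The config above repeats information that is already contained in the SAS connection string: the namespace host and, for an event-hub-level string, the event hub name (`EntityPath`). As a minimal sketch of that relationship, the following Python derives `bootstrap_servers` and `topic` from such a string. Both function names are hypothetical, and the parsing assumes the semicolon-separated `Key=Value` layout shown in the example password.

```python
from urllib.parse import urlparse


def parse_connection_string(conn: str) -> dict:
    """Split 'Key=Value;Key=Value' pairs into a dict.

    Splits on the first '=' only, because values such as a base64
    SharedAccessKey may themselves contain '='.
    """
    pairs = (item.split("=", 1) for item in conn.split(";") if "=" in item)
    return {key: value for key, value in pairs}


def kafka_fields(conn: str) -> dict:
    """Derive the Kafka-facing fields used in the config above (illustrative)."""
    fields = parse_connection_string(conn)
    host = urlparse(fields["Endpoint"]).hostname
    return {
        # The Kafka endpoint uses port 9093 on the namespace host.
        "bootstrap_servers": f"{host}:9093",
        # EntityPath is absent from namespace-level connection strings,
        # in which case this is None and the topic must be given separately.
        "topic": fields.get("EntityPath"),
    }


if __name__ == "__main__":
    example = (
        "Endpoint=sb://myhub.servicebus.windows.net/;"
        "SharedAccessKeyName=vector-producer;"
        "SharedAccessKey=dummy;EntityPath=vector"
    )
    print(kafka_fields(example))
```

With the example string from the config, this yields `myhub.servicebus.windows.net:9093` and `vector`, matching the `bootstrap_servers` and `topic` values written out by hand.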

dxlr8r commented 2 years ago

@fpytloun is that a suggestion for how it might look, or is the feature implemented? I could not find it in the documentation, source or commits.

jszwedko commented 2 years ago

@dxlr8r that will work now if you are running a version of Azure Event Hubs that supports the Kafka interface (I believe the basic tier does not).

fpytloun commented 2 years ago

Yes, in my opinion there's nothing to do under this issue, no code change needed and it's regular Kafka.

Maybe just extend the kafka sink documentation with an example of how to set up output to the Azure Event Hubs Kafka API?

jszwedko commented 2 years ago

Agreed, we do plan to add this to the documentation. However, the basic tier of Azure Event Hubs doesn't support connecting via Kafka, so we plan to leave this open until we can add support for AMQP 1.0, which the basic tier of Azure Event Hubs does support.

minghuaw commented 11 months ago

Agreed, we do plan to add this to the documentation. However, the basic tier of Azure Event Hubs doesn't support connecting via Kafka, so we plan to leave this open until we can add support for AMQP 1.0, which the basic tier of Azure Event Hubs does support.

The azeventhubs crate (https://crates.io/crates/azeventhubs), which is implemented on top of the AMQP 1.0 protocol, may be useful. Disclosure: I am the author of the crate.

zapdos26 commented 2 months ago

It looks like Azure has been creating an SDK for Event Hub here: https://github.com/Azure/azure-sdk-for-rust/tree/feature/track2/sdk/eventhubs/azure_messaging_eventhubs