TheThingsNetwork / lorawan-stack

The Things Stack, an Open Source LoRaWAN Network Server
https://www.thethingsindustries.com/stack/
Apache License 2.0

Confluent Kafka Webhook Integration #6340

Open devaskim opened 1 year ago

devaskim commented 1 year ago

Summary

Our company wants to build a robust data pipeline with no packet loss. Kafka is an industry standard, and Confluent Kafka is a ready-to-use platform for solving such cases.

Current Situation

With The Things Network's existing payload formatter feature, I can't transform the uplink payload into the format required by the Confluent Kafka REST API.

Why do we need this? Who uses it, and when?

Kafka is an industry standard, and Confluent Kafka is a ready-to-use platform for solving such cases.

Proposed Implementation

Provide a ready-to-use Webhook Integration Template for Confluent Kafka.

Here is how I can push data manually to a Confluent Kafka topic using the curl utility:

curl \
  -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Basic <BASE64-encoded-key-and-secret>" \
  https://pkc-419q3.us-east4.gcp.confluent.cloud:443/kafka/v3/clusters/lkc-gn5zwm/topics/topic-1/records \
  -d '{"value":{"type":"JSON","data": DEVICE_PAYLOAD_JSON }}'

The main part is the data format:

{
    "value": {
        "type": "JSON",
        "data": DEVICE_PAYLOAD_JSON
    }
}
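
For reference, here is the same request as a minimal Python sketch using only the standard library. The endpoint, cluster, and topic are the example values from the curl command above; the key/secret is a placeholder.

import base64
import json
import urllib.request

# Example values from the curl command above; replace with your own.
CONFLUENT_URL = (
    "https://pkc-419q3.us-east4.gcp.confluent.cloud:443"
    "/kafka/v3/clusters/lkc-gn5zwm/topics/topic-1/records"
)
API_KEY_AND_SECRET = "<key>:<secret>"  # placeholder credentials


def produce(device_payload: dict) -> None:
    # Wrap the device payload in the envelope the Confluent REST produce API expects.
    record = {"value": {"type": "JSON", "data": device_payload}}
    request = urllib.request.Request(
        CONFLUENT_URL,
        data=json.dumps(record).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Basic "
            + base64.b64encode(API_KEY_AND_SECRET.encode()).decode(),
        },
        method="POST",
    )
    # urlopen raises urllib.error.HTTPError on non-2xx responses.
    urllib.request.urlopen(request).close()

Calling produce({"temperature": 21.5}) would publish that JSON object to topic-1.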

KrishnaIyer commented 1 year ago

Our company wants to build a robust data pipeline with no packet loss. Kafka is an industry standard, and Confluent Kafka is a ready-to-use platform for solving such cases.

Do you mean loss of packets when device data is received by a gateway (on LoRa) or when an external integration receives data from The Things Stack?

If it's the former, then that's part of the LoRaWAN protocol. Please check our documentation for steps to minimise packet loss.

If it's the latter, then we simply don't lose packets that arrive at The Things Stack, nor do we have issue reports of this nature.

We only support plain JSON webhooks for now. We may revisit this topic later if necessary.

devaskim commented 1 year ago

@KrishnaIyer You are too quick to close tickets without any discussion.

Our data pipeline is bigger than just the The Things Network servers. And yes, on the The Things Network side we don't have any problems with packet loss. But we need to push the data further, and that is where loss could happen under high load, so we need some persistent "storage", which is Confluent Kafka in our case.

We only support plain JSON webhooks for now. We may revisit this topic later if necessary.

Do you mean that The Things Network cannot wrap the original JSON coming from an application formatter in another JSON that is compatible with the Confluent Kafka format?

KrishnaIyer commented 1 year ago

And yes, on the The Things Network side we don't have any problems with packet loss. But we need to push the data further, and that is where loss could happen under high load

The Things Stack is only concerned with data that it outputs. How you use that data further is not within scope.

Do you mean that The Things Network cannot wrap the original JSON coming from an application formatter in another JSON that is compatible with the Confluent Kafka format?

No. I mean that we will only look into this later if we have a lot of user demand. At the moment, you will have to do this transformation yourself.

I will reopen this issue now. If there are enough thumbs up ( 👍 ) on this issue, we will consider this in the future.

devaskim commented 1 year ago

The Things Stack is only concerned with data that it outputs. How you use that data further is not within scope.

Please forget about this discussion item. My question was just about a new Webhook Integration type.

At the moment, you will have to do this transformation yourself.

Could I do this with some feature provided by The Things Network out of the box?

And thanks for the quick response, I really appreciate it!

adriansmares commented 1 year ago

You can create a stateless integration that maps between our JSON format and the format that Confluent Kafka uses, via webhooks. In general, that is the expected workflow for integrations: an intermediate layer translates our JSON format into the format your integration expects, if the final ingesting application cannot support an extra format natively.

Webhooks allow you to do this translation in a scalable and highly available manner, and this is the main way we suggest that such integrations should be designed.
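
For illustration, a minimal sketch of such a mapping function in Python, assuming the standard uplink webhook body of The Things Stack v3 (device IDs under end_device_ids, the decoded payload under uplink_message.decoded_payload):

def tts_uplink_to_confluent_record(webhook_body: dict) -> dict:
    # Map a The Things Stack uplink webhook body to a Confluent REST record.
    # Using the device ID as the record key is a design choice, not a
    # requirement: it keeps all uplinks of one device in the same partition,
    # preserving their order.
    return {
        "key": {
            "type": "JSON",
            "data": webhook_body["end_device_ids"]["device_id"],
        },
        "value": {
            "type": "JSON",
            "data": webhook_body["uplink_message"]["decoded_payload"],
        },
    }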

devaskim commented 1 year ago

You can create a stateless integration that maps between our JSON format and the format that Confluent Kafka uses

@adriansmares Did you mean something like AWS Lambda, or some Things Stack feature? I already use the former for the aforementioned goal.

KrishnaIyer commented 1 year ago

Yes, something like AWS Lambda indeed.
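
As a hypothetical sketch of that approach: an AWS Lambda handler that receives the webhook (assuming a Lambda function URL or API Gateway proxy event, so the webhook body arrives as a string in event["body"]) and forwards the decoded payload to a Confluent REST produce endpoint like the one in the curl example above. The environment variable names are invented for this sketch.

import base64
import json
import os
import urllib.request

# Confluent endpoint and credentials from the environment
# (both variable names are invented for this sketch).
CONFLUENT_URL = os.environ["CONFLUENT_RECORDS_URL"]
CONFLUENT_AUTH = base64.b64encode(
    os.environ["CONFLUENT_KEY_AND_SECRET"].encode()
).decode()


def handler(event, context):
    # The proxy event carries the webhook body as a JSON string.
    uplink = json.loads(event["body"])

    # Translate The Things Stack uplink JSON into the Confluent envelope.
    record = {
        "value": {
            "type": "JSON",
            "data": uplink["uplink_message"]["decoded_payload"],
        }
    }

    request = urllib.request.Request(
        CONFLUENT_URL,
        data=json.dumps(record).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Basic " + CONFLUENT_AUTH,
        },
        method="POST",
    )
    # urlopen raises urllib.error.HTTPError on non-2xx responses.
    with urllib.request.urlopen(request) as response:
        return {"statusCode": response.status, "body": ""}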