Open zsherman opened 5 years ago
@binarylogic How would you recommend to handle this scenario in spring 2020? Having a lambda + vector with HTTP source? I saw, the HTTP source is not production ready yet.
Hi @sam701, this source is a little trickier, since it needs its own bookkeeping: checkpointing, shard discovery, and stream exclusivity across consumers.
Kafka, for example, handles a lot of this bookkeeping for its clients, making the integration much easier. That's why this source is not done yet. It's questionable whether all of the above fits within the scope of Vector, especially stream exclusivity, which would require distributed locking. That would obviously need to be delegated to a system designed for it.
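To make "stream exclusivity" concrete: each shard should be read by exactly one consumer at a time, which the KCL solves with a lease table backed by DynamoDB conditional writes. A minimal in-memory sketch of that idea (all names here are illustrative, not Vector or KCL APIs):

```python
import time

class LeaseTable:
    """In-memory stand-in for the KCL's DynamoDB lease table.

    A real implementation would use a DynamoDB conditional write so that
    two workers racing for the same shard cannot both win.
    """

    def __init__(self, ttl_seconds=30):
        self.ttl = ttl_seconds
        self.leases = {}  # shard_id -> (owner, lease_expiry)

    def try_claim(self, shard_id, owner, now=None):
        """Claim a shard lease; succeeds only if the lease is free,
        expired, or already held by this owner (which renews it)."""
        now = time.monotonic() if now is None else now
        holder = self.leases.get(shard_id)
        if holder is None or holder[1] <= now or holder[0] == owner:
            self.leases[shard_id] = (owner, now + self.ttl)
            return True
        return False
```

With this shape, a crashed worker's shards become claimable again once its leases expire, which is roughly how KCL workers rebalance.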
The best solution, imo, is to wrap Vector in a system that handles the above. Which is easier said than done. But, for example, if Vector were integrated into an AWS Lambda function, you could leverage AWS's Kinesis -> Lambda integration, which handles all of this for you.
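In that Lambda setup, AWS owns shard assignment, checkpointing, and retries, and the function only has to decode the records before handing them to Vector. A hedged sketch of such a handler (the forwarding step is stubbed out; the event shape is the standard Kinesis -> Lambda payload, but the handler name and return value are illustrative):

```python
import base64

def handler(event, context):
    """Decode a Kinesis -> Lambda event; record data arrives base64-encoded."""
    lines = []
    for record in event["Records"]:
        payload = base64.b64decode(record["kinesis"]["data"])
        lines.append(payload.decode("utf-8"))
    # In a real deployment these lines would be forwarded to Vector,
    # e.g. POSTed to an `http` source or written to a socket it listens on.
    return {"decoded": len(lines), "lines": lines}
```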
Any news on this?
I'd also be appreciative of this feature being added. In terms of the complexity, logstash already supports this so perhaps its solutions to the locking problem/etc can be reused? https://www.elastic.co/guide/en/logstash/current/plugins-inputs-kinesis.html
Looks like Logstash is able to support this because it uses the AWS Kinesis client library, but there is no Rust one at the moment: https://github.com/awslabs?q=kinesis-client&type=all&language=&sort=
I guess we could call some Java or Python code from Rust, but that seems bad to me.
There is https://crates.io/crates/aws-sdk-kinesis linked to from https://awslabs.github.io/aws-sdk-rust/, so perhaps that has most of what's needed?
There is a client for Rust, already being used by the Kinesis sink, but the issue is the complexity of synchronization and shard discovery. The Logstash input uses the AWS KCL, which needs a DynamoDB table (similar to how Kafka uses ZooKeeper for state). @binarylogic's comment explains the complexity behind this well.
I think, however, another option could be to run the AWS Java KCL MultiLangDaemon and pipe its output to a Vector stdin source.
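If a KCL consumer such as the MultiLangDaemon is doing the shard management and printing records to stdout, the Vector side reduces to a plain `stdin` source. A minimal sketch of that config (source and sink names are illustrative):

```toml
# vector.toml -- read whatever the KCL consumer prints, one record per line
[sources.kinesis_via_kcl]
type = "stdin"

[sinks.out]
type = "console"
inputs = ["kinesis_via_kcl"]
encoding.codec = "json"
```

It would then be wired up with a shell pipe along the lines of `kcl-consumer | vector --config vector.toml`, at the cost of running and supervising the Java daemon alongside Vector.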
It would be nice if Vector could ingest logs from an AWS Kinesis data stream (not Firehose, which is covered in #3566).
Requirements
`kinesis.stream` and `kinesis.partition` as context fields (`.` denoting nested fields).