conduitio-labs / conduit-connector-benthos

Conduit connector wrapping Benthos inputs and outputs
Apache License 2.0

Using benthos as a pipeline processor in conduit #4

Closed nickchomey closed 3 months ago

nickchomey commented 3 months ago

I'm relatively new to these sorts of tools, but it seems to me that Conduit and Benthos are somewhat redundant as they are both stream processors. As such, it seems somewhat silly to use them together - better to choose one.

The main difference between them seems to be that Conduit is much more focused on CDC from data sources, stores, and databases, while Benthos is much more focused on the actual pipeline processing and transformation: it has many dozens of processors, while Conduit has only a handful. Benthos processors also allow for enrichment via SQL queries, NATS KV, etc.

It seems to me that Conduit's universal OpenCDC approach is far more fundamental and important, since a pipeline ultimately needs to start from some data source, so Conduit should be used as the main tool. But it would be a shame not to leverage the immense processing power of Benthos.

So, what I'm thinking is that rather than use Benthos as a Conduit Source/Destination, as was attempted in this repo, why not just embed its pipeline processors into Conduit as a Standalone Processor? It could have some sort of Benthos Bloblang mechanism for choosing the desired Benthos processors.

This would allow Conduit to focus on its strength of CDC, while leveraging Benthos' strength in stream processing. You could, of course, always make other custom processors in Go or JavaScript to suit your needs (or probably even use existing Benthos custom processors).

This topic has been brought up various times in Benthos' GitHub and Discord, and the questions are generally answered with the following links:

Apparently this can be used to embed Benthos into a golang app/binary https://pkg.go.dev/github.com/benthosdev/benthos/v4/public/service#example-package-StreamBuilderConfig

One more example of that api here https://github.com/benthosdev/benthos/issues/1727#issuecomment-1423210719

And here's a repo that apparently has relevant examples https://github.com/benthosdev/benthos-plugin-example

I'm new to golang, but I think I'd like to try to figure this out in the next couple weeks as a way of getting my feet wet. If I can get some guidance on it, I see no reason why it couldn't be achieved.

Thoughts?

lovromazgon commented 3 months ago

Thanks for opening this, it's an idea worth discussing. It's just not the correct place, since it's about adding a processor to Conduit itself - do you mind moving this into a discussion on the main Conduit repo? 🙏

Quick link: https://github.com/ConduitIO/conduit/discussions/new?category=ideas

I'll take my time to respond there, but I'll say this much - it's not trivial to do this as a standalone processor, because of the WASI limitations.

Meanwhile I'll close this issue so all further comments are redirected to the correct place.

nickchomey commented 3 months ago

Thanks for the prompt attention! Here's the new discussion https://github.com/ConduitIO/conduit/discussions/1614