PipedreamHQ / pipedream

Connect APIs, remarkably fast. Free for developers.
https://pipedream.com

[FEATURE] Container Image / Alternative Language Steps #798

Open rawkode opened 3 years ago

rawkode commented 3 years ago

Is your feature request related to a problem? Please describe.

Not really a problem, but the flexibility to use other runtimes would be interesting for a few use cases, such as image manipulation, binary protocols, and machine learning.

Describe the solution you'd like

In addition to the "NodeJS Step", it would be rather cool to use other languages or container images with pre-built functions. The OpenFaaS project has a selection of images that provide functionality through their fwatchdog binary, which translates HTTP requests into stdin and stdout into HTTP responses. I suspect Pipedream could use a similar approach for chaining steps.

OpenFaaS watchdog:

https://docs.openfaas.com/architecture/watchdog/

A list of the OpenFaaS functions:

https://hub.docker.com/u/functions
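
To make the stdin/stdout idea concrete, here is a minimal sketch in Python of the core of a watchdog-style wrapper. It is not the OpenFaaS implementation and not anything Pipedream provides: `run_step` and its status-code mapping are hypothetical, and the HTTP server around it is left out. It only shows a request body being piped to a process and the process output becoming the response.

```python
import subprocess
import sys


def run_step(command, body, timeout=30.0):
    """Pipe a request body to a process and map the result to (status, body).

    Hypothetical helper: the status codes are an assumed convention,
    not something defined by OpenFaaS or Pipedream.
    """
    try:
        proc = subprocess.run(
            command,
            input=body,           # request body -> stdin
            capture_output=True,  # collect stdout and stderr
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return 504, b"step timed out"
    if proc.returncode != 0:
        # A non-zero exit is reported as a server error with stderr as the body.
        return 500, proc.stderr
    return 200, proc.stdout       # stdout -> response body


if __name__ == "__main__":
    # Stand-in for a containerized function: upper-case whatever arrives on stdin.
    upper = [sys.executable, "-c",
             "import sys; sys.stdout.write(sys.stdin.read().upper())"]
    print(run_step(upper, b"hello"))
```

In a real setup the command would be the container's entrypoint rather than a local Python one-liner, and the previous step's exports would be serialized into the request body.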

mroy-seedbox commented 1 year ago

This feature would be amazing!

Just connect to a container registry (or use public containers from DockerHub or GitHub Container Registry), specify the command to run (if necessary) and the container configs (if any), and voilà! 🎉

Just need to add a way to specify the return value for the step (so that subsequent steps can use it). Maybe just use stdout (which could contain JSON)? Or look for a specific file (could be a parameter for the step: outputFile, which should be a JSON file).
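
A rough sketch of how those two options could be combined, in Python. Everything here is an assumption for illustration: `read_step_result` is a hypothetical helper, and only the `outputFile` idea comes from the comment above. The file takes precedence when it is given; otherwise stdout is parsed as JSON, with a fallback to the raw text.

```python
import json
from pathlib import Path
from typing import Any, Optional


def read_step_result(stdout: str, output_file: Optional[str] = None) -> Any:
    """Derive a step's return value from a finished container run.

    Hypothetical helper: prefers the JSON file named by `output_file`
    (the "outputFile" parameter suggested above), else parses stdout.
    """
    if output_file is not None:
        return json.loads(Path(output_file).read_text(encoding="utf-8"))
    text = stdout.strip()
    if not text:
        return None  # the container printed nothing
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Not JSON (e.g. plain log output): hand the raw text to later steps.
        return text
```

Falling back to the raw string keeps containers that only print logs usable, at the cost of later steps having to check which type they received.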

Our primary use case would be to be able to run DBT core/CLI on Pipedream, instead of having to rely on DBT Cloud (and then we could delete our DBT Cloud account). Should be simple if we can just use a pre-built DBT container image, and run the command(s) that we need.

I think it's pretty important for the future of Pipedream, and to stand out from the competition. Just like Snowflake recently announced the Snowflake Container Services+Registry, with the ability to run any container inside of Snowflake: Pipedream customers could similarly run "anything" inside of Pipedream! (as workflow steps, of course)

If Pipedream doesn't plan to support this feature... then I guess we could run our containers in Snowflake. It's either that or ECS (or the equivalent from Azure & GCP), which requires a lot more effort to set up & orchestrate. Containers can also run on Lambda, but with a maximum runtime of 15 minutes (which should be enough for most workloads, but still not a fun limitation 😞).

mroy-seedbox commented 1 year ago

After further consideration, triggering container runs on Lambda is probably the best & simplest approach. So this feature isn't explicitly necessary for us at the moment!

Bonus: we can use flow control with Lambda sending a callback to Pipedream once it's done! 🎉 So the Pipedream workflow would remain idle until the container has finished running.
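
For anyone curious what the Lambda side of that callback could look like, here is a minimal Python sketch. It assumes the workflow passes a resume/callback URL in the invocation event; the `callback_url` and `params` field names, `run_container_job`, and the payload shape are all hypothetical, not a Pipedream or AWS convention.

```python
import json
import urllib.request


def post_json(url, payload):
    """POST a JSON payload and return the HTTP status code."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status


def run_container_job(params):
    """Placeholder for the real work (e.g. invoking a CLI inside the image)."""
    return {"ok": True, "params": params}


def handler(event, context, post=post_json):
    """Lambda entry point: do the work, then notify the waiting workflow.

    `post` is injectable so the handler can be exercised without a network.
    """
    try:
        payload = {"status": "succeeded",
                   "result": run_container_job(event.get("params", {}))}
    except Exception as exc:
        # Report failures too, so the workflow is not left waiting.
        payload = {"status": "failed", "error": str(exc)}
    post(event["callback_url"], payload)
    return payload
```

Sending the callback on failure as well as success matters here, since otherwise the suspended workflow would only resume when its own timeout expires.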

Still, for users who don't want to deal with ECS or Lambda, integrating the capability to run containers directly inside of Pipedream would remain a very powerful feature.

ctrlaltdylan commented 1 year ago

Just want to chime in from Pipedream.

Yes, we've discussed this internally a few times. We know this would ultimately provide the most flexibility and unlock even more possibilities for building Pipedream workflows.

It's on our roadmap, and we're very interested in supporting this in the future so you can bring in libraries like puppeteer, youtube-dl, ffmpeg, etc. that rely on different runtimes or binaries.