spring-cloud / spring-cloud-dataflow

A microservices-based Streaming and Batch data processing in Cloud Foundry and Kubernetes
https://dataflow.spring.io
Apache License 2.0

spring-cloud-dataflow vs spring-cloud-function #1977

Closed cforce closed 6 years ago

cforce commented 6 years ago

Mirrored question; see https://github.com/spring-cloud/spring-cloud-function/issues/137

sabbyanandan commented 6 years ago

Hi, @cforce.

Spring Cloud Function (SCFn) is a runtime-agnostic framework for Java (see adapters; also fnproject). You get the familiar (Spring) programming model to develop standalone functions. The same function can be launched as a web endpoint, a stream, or a task on a variety of runtimes, too. To do that, SCFn provides web, streaming, and task integration with the help of relevant projects in the Spring Cloud ecosystem (e.g., Spring Cloud Deployer, Spring Cloud Stream, Spring Cloud Task).

SCDF is an orchestration service; you'd use it to choreograph data pipelines (via DSL/Dashboard/REST-APIs) made of a series of streaming-apps, task-apps, or both. In 1.3, you could use Skipper with SCDF to do continuous-delivery over granular streaming-apps in the data pipeline.

Also, you can orchestrate a data pipeline made of SCFn workloads in SCDF. There's a purpose-built OOTB function-runner app that makes it possible.

dataflow:>stream create foo --definition "http --server.port=9001 | function-runner --function.className=com.example.functions.CharCounter --function.location=file:///<PATH/TO/SPRING-CLOUD-FUNCTION>/spring-cloud-function-samples/function-sample/target/spring-cloud-function-sample-1.0.0.BUILD-SNAPSHOT.jar | log" --deploy

_(see full-example here)_

A developer will focus only on the business logic; in this case, the CharCounter class. In other words, you don't have to build a full-blown Stream/Task application.
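For illustration, such a function can be as small as the sketch below. This is an assumption about its shape (a plain `java.util.function.Function` from `String` to `Integer`), not a copy of the sample's source; the package declaration is left out so the file compiles on its own, and the `main` method is only there to try it locally.

```java
import java.util.function.Function;

// Hypothetical sketch of the kind of class function-runner would load.
// The stream definition above refers to com.example.functions.CharCounter;
// the package line is omitted here so this file is self-contained.
public class CharCounter implements Function<String, Integer> {

    // Business logic only: return the number of characters in the payload.
    @Override
    public Integer apply(String payload) {
        return payload.length();
    }

    // Local smoke test; not needed when the class is run inside a stream.
    public static void main(String[] args) {
        Function<String, Integer> counter = new CharCounter();
        System.out.println(counter.apply("hello")); // prints 5
    }
}
```

Nothing in the class depends on messaging, HTTP, or SCDF itself; the binding to the `http` source and `log` sink comes from the stream definition.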

Hope this helps. If you think this answers it adequately, please consider closing the issue.

cforce commented 6 years ago

Thanks a lot for your explanation; it helped a lot. I am asking whether I can or should use such an architecture for IoT scenarios where I have 10t requests per second. Currently we have a microservice architecture where I compose functions from Groovy code that is compiled on application start and ordered into a pipe, calling one after the other in about 4 steps. From a functional point of view, it would be useful to be able to write these as (simple) Spring Cloud Functions in any language and deploy them separately on any XaaS, or even mix them with managed functions from cloud providers. However, our current app runs all these functions embedded in the request thread, executed from cached Groovy classes from the classloader in the same process. I am asking myself what downsides, bad effects, or even bottlenecks I would get in SCDF when using Spring Cloud Functions, where every request would call at least 4 separate functions in a row instead of one app. Additionally, the remoting between functions (I mean the exchange via Kafka/Rabbit, which introduces remoting) and the time for the functions to start up (or is there some hot loading from an in-memory cached version, which would only work on the same machine, or container in Kubernetes terms?) add to the overhead, don't they? Furthermore, I need to access databases that I also have cached in shards, and I don't see stateful (in-memory) apps here that can guarantee to handle that load.

Does it even make sense to use this SCDF platform for a continuous data flow from IoT devices with constant load (except at night), or is it better to have dynamically scaled microservice instances (web endpoint + reactive application)?

sabbyanandan commented 6 years ago

This is getting deeper into solution architecture. StackOverflow is a great forum for this type of question.