# DataFibers Data Services

DataFibers (DF) - a pure stream-processing application built on Apache Kafka and Apache Flink.

The DF processor defines two components to deal with stream ETL (Extract, Transform, and Load):

- Connects leverages the Kafka Connect REST API (on the Confluent platform) to land data into, or publish data out of, Apache Kafka (see the REST sketch after this list).
- Transforms leverages a stream-processing engine, such as Apache Flink, for data transformation (see the Table API sketch after this list).
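As an illustration of the Connects side, here is a minimal sketch that registers a connector through the Kafka Connect REST API using only the JDK's built-in HTTP client (Java 11+, with a Java 15+ text block); the connector name, file path, topic, and REST address are placeholders, not values taken from this project:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ConnectRestSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical connector config: a FileStreamSource landing a local
        // file into a Kafka topic. Name, file, and topic are placeholders.
        String config = """
            {
              "name": "demo-file-source",
              "config": {
                "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
                "tasks.max": "1",
                "file": "/tmp/demo.txt",
                "topic": "demo_topic"
              }
            }""";

        // POST to the Connect REST endpoint (8083 is the default Connect port).
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8083/connectors"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(config))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```

And for the Transforms side, a minimal sketch of a continuous transformation with the Flink Table API. This uses recent Flink SQL syntax and needs the Kafka connector and JSON format jars on the classpath; the table definitions, topics, and broker address are assumptions, and the project itself may target an older Flink release:

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class TransformSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().inStreamingMode().build());

        // Hypothetical source: read JSON events from a Kafka topic.
        tEnv.executeSql(
            "CREATE TABLE events (user_name STRING, amount DOUBLE) WITH ("
          + " 'connector' = 'kafka',"
          + " 'topic' = 'demo_topic',"
          + " 'properties.bootstrap.servers' = 'localhost:9092',"
          + " 'format' = 'json',"
          + " 'scan.startup.mode' = 'earliest-offset')");

        // Hypothetical sink: upsert-kafka handles the updating aggregate,
        // so a primary key is required on the sink table.
        tEnv.executeSql(
            "CREATE TABLE totals (user_name STRING, total DOUBLE,"
          + " PRIMARY KEY (user_name) NOT ENFORCED) WITH ("
          + " 'connector' = 'upsert-kafka',"
          + " 'topic' = 'demo_totals',"
          + " 'properties.bootstrap.servers' = 'localhost:9092',"
          + " 'key.format' = 'json',"
          + " 'value.format' = 'json')");

        // The transformation itself: a continuous aggregation.
        tEnv.executeSql(
            "INSERT INTO totals SELECT user_name, SUM(amount) FROM events GROUP BY user_name");
    }
}
```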
## Building

Build the project with:

```
mvn clean package
```
## Testing

The application is tested using vertx-unit.
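For example, a minimal sketch of a vertx-unit test with the JUnit 4 runner; the class name, handler, and port are illustrative rather than taken from the project's test suite:

```java
import io.vertx.core.Vertx;
import io.vertx.ext.unit.Async;
import io.vertx.ext.unit.TestContext;
import io.vertx.ext.unit.junit.VertxUnitRunner;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import org.junit.runner.RunWith;

@RunWith(VertxUnitRunner.class)
public class HttpServerSketchTest {
    private Vertx vertx;

    @Before
    public void setUp(TestContext context) {
        vertx = Vertx.vertx();
    }

    @After
    public void tearDown(TestContext context) {
        // asyncAssertSuccess fails the test if the shutdown fails.
        vertx.close(context.asyncAssertSuccess());
    }

    @Test
    public void httpServerStarts(TestContext context) {
        Async async = context.async();
        vertx.createHttpServer()
             .requestHandler(req -> req.response().end("ok"))
             .listen(8111, ar -> {          // arbitrary test port
                 context.assertTrue(ar.succeeded());
                 async.complete();          // signals the async test is done
             });
    }
}
```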
## Packaging

The application is packaged as a fat jar, using the Maven Shade Plugin.
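For reference, a typical Shade configuration for a Vert.x fat jar looks like the sketch below; the main verticle class is a placeholder, not necessarily this project's actual entry point:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <transformers>
          <!-- Make the jar runnable via the Vert.x Launcher -->
          <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
            <manifestEntries>
              <Main-Class>io.vertx.core.Launcher</Main-Class>
              <!-- Placeholder; substitute the project's main verticle -->
              <Main-Verticle>com.example.MainVerticle</Main-Verticle>
            </manifestEntries>
          </transformer>
        </transformers>
      </configuration>
    </execution>
  </executions>
</plugin>
```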
## Running

Once packaged, launch the fat jar in either of the following ways:

- Default, with no parameters, to launch standalone mode with the web UI:

```
java -jar df-data-service-<version>-SNAPSHOT-fat.jar
```

- For more runtime options, check the help:

```
java -jar df-data-service-<version>-SNAPSHOT-fat.jar -h
```
## Web UI

http://localhost:8000/ or http://localhost:8000/dfa/
## Manual

https://datafibers-community.gitbooks.io/datafibers-complete-guide/content/
## Demo
## Todo

- [x] Fetch all installed connectors/plugins at a regular frequency
- [x] Report connector or job status
- [x] Provide an initial method to import all available|paused|running connectors from Kafka Connect
- [x] Add Flink Table API engine
- [ ] Add in-memory LKP
- [x] Add Connects and Transforms logging URLs
- [ ] Add a generic function to validate connectors before creation
- [x] Support submitting other job actions, such as start, hold, etc.
- [ ] Add Spark Structured Streaming
- [x] Topic visualization
- [ ] Launch 3rd-party jars
- [ ] Job-level control, scheduling, and metrics