πFlow is an easy-to-use, powerful big data pipeline system.
It is compatible with x86 and ARM architectures and supports deployment on CentOS and Kirin systems.
install external package
mvn install:install-file -Dfile=/../piflow/piflow-bundle/lib/spark-xml_2.11-0.4.2.jar -DgroupId=com.databricks -DartifactId=spark-xml_2.11 -Dversion=0.4.2 -Dpackaging=jar
mvn install:install-file -Dfile=/../piflow/piflow-bundle/lib/java_memcached-release_2.6.6.jar -DgroupId=com.memcached -DartifactId=java_memcached-release -Dversion=2.6.6 -Dpackaging=jar
mvn install:install-file -Dfile=/../piflow/piflow-bundle/lib/ojdbc6-11.2.0.3.jar -DgroupId=oracle -DartifactId=ojdbc6 -Dversion=11.2.0.3 -Dpackaging=jar
mvn install:install-file -Dfile=/../piflow/piflow-bundle/lib/edtftpj.jar -DgroupId=ftpClient -DartifactId=edtftp -Dversion=1.0.0 -Dpackaging=jar
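The four commands above differ only in the jar file and its Maven coordinates, so they can be generated from a small table. The following is a minimal sketch, not part of piflow: `PIFLOW_SRC` is a hypothetical variable pointing at your piflow checkout (it defaults to the `/../piflow` prefix used above), and the function only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Hypothetical variable: path to your piflow checkout.
PIFLOW_SRC="${PIFLOW_SRC:-/../piflow}"

# Print one "mvn install:install-file" command per bundled jar.
# Each row is: jar file | groupId | artifactId | version
# (values copied from the commands in this README).
print_install_cmds() {
  lib="$PIFLOW_SRC/piflow-bundle/lib"
  while IFS='|' read -r jar group artifact version; do
    echo "mvn install:install-file -Dfile=$lib/$jar -DgroupId=$group -DartifactId=$artifact -Dversion=$version -Dpackaging=jar"
  done <<'EOF'
spark-xml_2.11-0.4.2.jar|com.databricks|spark-xml_2.11|0.4.2
java_memcached-release_2.6.6.jar|com.memcached|java_memcached-release|2.6.6
ojdbc6-11.2.0.3.jar|oracle|ojdbc6|11.2.0.3
edtftpj.jar|ftpClient|edtftp|1.0.0
EOF
}

print_install_cmds
```

Once the printed commands look right, `print_install_cmds | sh` would run them in order.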
mvn clean package -Dmaven.test.skip=true
[INFO] Replacing original artifact with shaded artifact.
[INFO] Reactor Summary:
[INFO]
[INFO] piflow-project ..................................... SUCCESS [ 4.369 s]
[INFO] piflow-core ........................................ SUCCESS [01:23 min]
[INFO] piflow-configure ................................... SUCCESS [ 12.418 s]
[INFO] piflow-bundle ...................................... SUCCESS [02:15 min]
[INFO] piflow-server ...................................... SUCCESS [02:05 min]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 06:01 min
[INFO] Finished at: 2020-05-21T15:22:58+08:00
[INFO] Final Memory: 118M/691M
[INFO] ------------------------------------------------------------------------
run piflow server on IntelliJ:
download piflow: git clone https://github.com/cas-bigdatalab/piflow.git
import piflow into Intellij
edit config.properties file
build piflow to generate piflow jar:
Edit Configurations --> Add New Configuration --> Maven
Name: package
Command line: clean package -Dmaven.test.skip=true -X
run 'package' (piflow jar file will be built in ../piflow/piflow-server/target/piflow-server-0.9.jar)
run HttpService:
Edit Configurations --> Add New Configuration --> Application
Name: HttpService
Main class : cn.piflow.api.Main
Environment Variable: SPARK_HOME=/opt/spark-2.2.0-bin-hadoop2.6 (change the path to your Spark home)
run 'HttpService'
test HttpService:
run /../piflow/piflow-server/src/main/scala/cn/piflow/api/HTTPClientStartMockDataFlow.scala
change the piflow server IP and port to match your configuration
run piflow server from a release version:
download piflow.tar.gz:
https://github.com/cas-bigdatalab/piflow/releases/download/v1.2/piflow-server-v1.5.tar.gz
unzip piflow.tar.gz:
tar -zxvf piflow.tar.gz
edit config.properties
run start.sh, stop.sh, restart.sh, status.sh
test piflow server
set PIFLOW_HOME
vim /etc/profile
export PIFLOW_HOME=/yourPiflowPath/bin
export PATH=$PATH:$PIFLOW_HOME/bin
command
piflow flow start example/mockDataFlow.json
piflow flow stop appID
piflow flow info appID
piflow flow log appID
piflow flowGroup start example/mockDataGroup.json
piflow flowGroup stop groupId
piflow flowGroup info groupId
how to configure config.properties
spark.master=yarn
spark.deploy.mode=cluster
fs.defaultFS=hdfs://10.0.86.191:9000
yarn.resourcemanager.hostname=10.0.86.191
data.show=10
server.port=8002
h2.port=50002
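Before starting the server it can help to confirm that config.properties actually contains these keys. The following is a minimal sketch, not a piflow tool: `get_prop` and `check_config` are hypothetical helper names, the key list is simply the one shown above (whether every key is mandatory is an assumption), and only plain `key=value` lines are handled.

```shell
#!/bin/sh
# Print the value of KEY ($2) from a properties FILE ($1).
# Only simple key=value lines are supported: lines starting with '#'
# are skipped, and escapes or line continuations are not handled.
get_prop() {
  awk -F= -v key="$2" '
    /^[[:space:]]*#/ { next }
    $1 == key { sub(/^[^=]*=/, ""); print; exit }
  ' "$1"
}

# Report every key from this README that is missing or empty in FILE ($1).
# Returns 0 when all keys are present, 1 otherwise.
check_config() {
  missing=0
  for key in spark.master spark.deploy.mode fs.defaultFS \
             yarn.resourcemanager.hostname data.show server.port h2.port; do
    if [ -z "$(get_prop "$1" "$key")" ]; then
      echo "missing: $key"
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `check_config /path/to/config.properties && echo ok` prints `ok` only when none of the keys is missing.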
The version must be consistent with piflow-server.
vim /usr/lib/systemd/system/docker.service
ExecStart=/usr/bin/dockerd -H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock
systemctl daemon-reload
systemctl restart docker
flow json
{
  "flow": {
    "name": "MockData",
    "executorMemory": "1g",
    "executorNumber": "1",
    "uuid": "8a80d63f720cdd2301723b7461d92600",
    "paths": [
      {
        "inport": "",
        "from": "MockData",
        "to": "ShowData",
        "outport": ""
      }
    ],
    "executorCores": "1",
    "driverMemory": "1g",
    "stops": [
      {
        "name": "MockData",
        "bundle": "cn.piflow.bundle.common.MockData",
        "uuid": "8a80d63f720cdd2301723b7461d92604",
        "properties": {
          "schema": "title:String, author:String, age:Int",
          "count": "10"
        },
        "customizedProperties": {}
      },
      {
        "name": "ShowData",
        "bundle": "cn.piflow.bundle.external.ShowData",
        "uuid": "8a80d63f720cdd2301723b7461d92602",
        "properties": {
          "showNumber": "5"
        },
        "customizedProperties": {}
      }
    ]
  }
}
CURL POST:
Command line:
set PIFLOW_HOME
vim /etc/profile
export PIFLOW_HOME=/yourPiflowPath/piflow-bin
export PATH=$PATH:$PIFLOW_HOME/bin
command example
piflow flow start yourFlow.json
piflow flow stop appID
piflow flow info appID
piflow flow log appID
piflow flowGroup start yourFlowGroup.json
piflow flowGroup stop groupId
piflow flowGroup info groupId
pull piflow images
docker pull registry.cn-hangzhou.aliyuncs.com/cnic_piflow/piflow:v1.5
show docker images
docker images
run a container from the piflow image ID; all services start automatically. Set HOST_IP and the docker options below for your host.
docker run -h master -itd --env HOST_IP=*.*.*.* --name piflow-v1.5 -p 6001:6001 -v /usr/bin/docker:/usr/bin/docker -v /var/run/docker.sock:/var/run/docker.sock --add-host docker.host:*.*.*.* [imageID]
visit "HOST_IP:6001"; the page may take a while to become available
if something goes wrong, all the applications are in the /opt folder
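The `docker run` command above contains two `*.*.*.*` placeholders and an `[imageID]` placeholder. The following is a minimal sketch that fills them in and prints the resulting command for review. `print_docker_run` is a hypothetical helper, and it assumes that both placeholders take the same host address; the example address and image ID are made up.

```shell
#!/bin/sh
# Print the "docker run" command from this README with placeholders filled in.
# $1 = host IP (assumption: used for both HOST_IP and the docker.host entry)
# $2 = image ID as listed by "docker images"
print_docker_run() {
  if [ "$#" -ne 2 ]; then
    echo "usage: print_docker_run HOST_IP IMAGE_ID" >&2
    return 1
  fi
  echo "docker run -h master -itd --env HOST_IP=$1 --name piflow-v1.5 -p 6001:6001" \
       "-v /usr/bin/docker:/usr/bin/docker -v /var/run/docker.sock:/var/run/docker.sock" \
       "--add-host docker.host:$1 $2"
}

# Example values only; replace them with your own host IP and image ID.
print_docker_run 192.0.2.10 0123456789ab
```

Printing the command instead of executing it keeps the sketch safe to run on a machine without docker; copy the output into a shell once the values are correct.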
Login
Dashboard
Flow list
Create flow
Configure flow
Load flow
Monitor flow
Flow logs
Group list
Configure group
Monitor group
Process List
Template List
DataSource List
Schedule List
StopHub List
Name: Yang Gang, Tian Yao
Mobile Phone: 13253365393, 18501260806
WeChat: 13253365393, 18501260806
Email: ygang@cnic.cn, tianyao@cnic.cn
Private vulnerability contact information: ygang@cnic.cn
WeChat User Group
WeChat Official Account