apache / incubator-streampark

Make stream processing easier! Easy-to-use streaming application development framework and operation platform.
https://streampark.apache.org/
Apache License 2.0

StreamPark FAQ #507

Open xinzhuxiansheng opened 2 years ago

xinzhuxiansheng commented 2 years ago

StreamPark logo

StreamPark ── A magical framework that makes Flink & Spark easier!

FAQ

Here is a compilation of popular issues frequently reported by users. If you have a new question, please open an issue. Do not ask questions here; this thread is not a Q&A area.

xinzhuxiansheng commented 2 years ago

1. maven install error: Failed to run task 'npm install'?


Because the front end is built with Node.js, make sure Node.js is installed on the build machine and that its version is not too old. You can enter streamx-console-webapp and run npm install manually to try the compilation; if it still fails, look up Node.js build troubleshooting and resolve the issue on your machine.
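As a quick pre-flight check, a sketch along these lines can confirm the toolchain is present before running the build (no exact minimum version is stated in this thread; consult "engines" in the project's package.json):

```shell
# Sketch: check the Node.js toolchain before building streamx-console-webapp.
# The check only confirms presence; the version floor is up to your checkout.
if command -v node >/dev/null 2>&1; then
  node_check="node toolchain: $(node --version)"
else
  node_check="node toolchain: missing -- install Node.js before running 'npm install'"
fi
echo "$node_check"
```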

xinzhuxiansheng commented 2 years ago
  1. #530: ClusterDeploymentException: Couldn't deploy Yarn Application Cluster (NumberFormatException: For input string: "30s")

The exception looks like this:

Caused by: org.apache.flink.client.deployment.ClusterDeploymentException: Couldn't deploy Yarn Application Cluster
at org.apache.flink.yarn.YarnClusterDescriptor.deployApplicationCluster(YarnClusterDescriptor.java:465)
at com.streamxhub.streamx.flink.submit.impl.YarnApplicationSubmit$$anon$1.call(YarnApplicationSubmit.scala:80)
at com.streamxhub.streamx.flink.submit.impl.YarnApplicationSubmit$$anon$1.call(YarnApplicationSubmit.scala:64)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1926)
... 157 more
Caused by: java.lang.NumberFormatException: For input string: "30s"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Long.parseLong(Long.java:589)
at java.lang.Long.parseLong(Long.java:631)
at org.apache.hadoop.conf.Configuration.getLong(Configuration.java:1435)
at org.apache.hadoop.hdfs.client.impl.DfsClientConf.<init>(DfsClientConf.java:255)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:319)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:303)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:159)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3247)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:121)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3296)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3264)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:475)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:228)
at org.apache.flink.yarn.YarnClusterDescriptor.startAppMaster(YarnClusterDescriptor.java:769)
at org.apache.flink.yarn.YarnClusterDescriptor.deployInternal(YarnClusterDescriptor.java:592)
at org.apache.flink.yarn.YarnClusterDescriptor.deployApplicationCluster(YarnClusterDescriptor.java:458)
... 162 more

Fixed; see #2443.
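For background, this "30s" failure is commonly reported when the Hadoop client jars are older than the cluster's configuration: newer Hadoop ships duration-suffixed defaults (e.g. "30s" for dfs.client.datanode-restart.timeout) that older client code still parses with Long.parseLong. A sketch that scans a config for such values; the sample file below is fabricated purely for illustration:

```shell
# Sketch: find duration-suffixed values that an older HDFS client cannot parse.
# Writes a fabricated sample config, then scans it the way you would scan your
# real client-side hdfs-site.xml.
cat > /tmp/hdfs-site-sample.xml <<'EOF'
<property>
  <name>dfs.client.datanode-restart.timeout</name>
  <value>30s</value>
</property>
EOF
suspect=$(grep -cE '<value>[0-9]+[smhd]</value>' /tmp/hdfs-site-sample.xml)
echo "duration-suffixed values found: $suspect"
```

A commonly reported workaround, before the fix referenced above, is to override such a value with a plain number (e.g. 30) in the client-side configuration, or to align the Hadoop client version with the cluster.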

xinzhuxiansheng commented 2 years ago
  1. After streamx-console starts, app.home is not set and a NullPointerException is thrown


The streamx-console initialization check failed. If it is started locally for development and debugging, make sure the -Dapp.home parameter is explicitly specified in the VM options; more detail: http://www.streamxhub.com/docs/user-guide/development/#vm-options
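For example, in an IDE run configuration the VM option looks like this (the path is a placeholder for your local installation directory):

```
-Dapp.home=/path/to/streamx-console
```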

xinzhuxiansheng commented 2 years ago
  1. Cause: java.sql.SQLSyntaxErrorException: Table 'streamx.t_setting' doesn't exist

### Cause: java.sql.SQLSyntaxErrorException: Table 'streamx.t_setting' doesn't exist
; bad SQL grammar []; nested exception is java.sql.SQLSyntaxErrorException: Table 'streamx.t_setting' doesn't exist
    at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor.postProcessBeforeInitialization(InitDestroyAnnotationBeanPostProcessor.java:160)
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.applyBeanPostProcessorsBeforeInitialization(AbstractAutowireCapableBeanFactory.java:415)
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1786)
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:594)
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:516)
    at org.springframework.beans.factory.support.AbstractBeanFactory.lambda$doGetBean$0(AbstractBeanFactory.java:324)
    at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:234)
    at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:322)
    at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:202)
    at org.springframework.beans.factory.config.DependencyDescriptor.resolveCandidate(DependencyDescriptor.java:276)
    at org.springframework.beans.factory.support.DefaultListableBeanFactory.doResolveDependency(DefaultListableBeanFactory.java:1307)
    at org.springframework.beans.factory.support.DefaultListableBeanFactory.resolveDependency(DefaultListableBeanFactory.java:1227)
    at org.springframework.beans.factory.annotation.AutowiredAnnotationBeanPostProcessor$AutowiredFieldElement.inject(AutowiredAnnotationBeanPostProcessor.java:640)
    ... 57 common frames omitted

For StreamX v1.2.0 and later, as well as the main branch (not including v1.2.0), you need to run the SQL scripts manually to initialize the table schema. See: https://github.com/streamxhub/streamx/tree/main/streamx-console/streamx-console-service/src/assembly/script

xinzhuxiansheng commented 2 years ago
  1. java.io.InvalidClassException: scala.collection.immutable.Set$EmptySet$

    
    2021-12-02 18:01:27 | INFO  | XNIO-1 task-4 | com.streamxhub.streamx.console.core.entity.Application ] local appHome:~/streamx_workspace/workspace/1466345568741457922
    2021-12-02 18:01:28 | INFO  | XNIO-1 task-4 | com.streamxhub.streamx.flink.proxy.FlinkShimsProxy ] [StreamX] 
    ----------------------------------------- flink version -----------------------------------
     flinkHome    : /data/flink-1.14.0
     distJarName  : flink-dist_2.12-1.14.0.jar
     flinkVersion : 1.14.0
     majorVersion : 1.14
     scalaVersion : 2.12
     shimsVersion : streamx-flink-shims_flink-1.14
    -------------------------------------------------------------------------------------------

java.io.InvalidClassException: scala.collection.immutable.Set$EmptySet$; local class incompatible: stream classdesc serialVersionUID = -1118802231467657162, local class serialVersionUID = -2443710944435909512
    at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:699)
    at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:2001)
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1848)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2158)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1665)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2403)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2327)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2185)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1665)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2403)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2327)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2185)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1665)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2403)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2327)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2185)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1665)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:501)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:459)



Currently StreamX only supports Scala 2.11, so you need to change both the Scala version of your application and the Flink distribution that FLINK_HOME points to to a 2.11 build. Verified: after replacing the program and the Flink installation with the Scala 2.11 versions, it works. OK :)
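If the application is built with Maven, aligning on Scala 2.11 typically means pinning the Scala properties in the pom; a sketch (the property names below are the conventional ones and may differ in your build):

```xml
<properties>
  <!-- pin both the full and the binary Scala version to the 2.11 line -->
  <scala.version>2.11.12</scala.version>
  <scala.binary.version>2.11</scala.binary.version>
</properties>
```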

---
xinzhuxiansheng commented 2 years ago
  1. Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient


The configuration file is missing metastore.uri; add it and the problem goes away. See #219.
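The thread calls the setting metastore.uri; in a standard Hive client configuration the corresponding key is hive.metastore.uris in hive-site.xml. A sketch with placeholder host and port:

```xml
<property>
  <name>hive.metastore.uris</name>
  <!-- placeholder: point this at your Hive metastore's thrift endpoint -->
  <value>thrift://metastore-host:9083</value>
</property>
```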

xinzhuxiansheng commented 2 years ago
  1. java.lang.RuntimeException: java.io.IOException: com.sun.jna.LastErrorException: [2] No such file or directory

    java.lang.RuntimeException: java.io.IOException: com.sun.jna.LastErrorException: [2] No such file or directory
    at com.github.dockerjava.httpclient5.ApacheDockerHttpClientImpl.execute(ApacheDockerHttpClientImpl.java:187)
    at com.github.dockerjava.httpclient5.ApacheDockerHttpClient.execute(ApacheDockerHttpClient.java:9)
    at com.github.dockerjava.core.DefaultInvocationBuilder.execute(DefaultInvocationBuilder.java:228)
    at com.github.dockerjava.core.DefaultInvocationBuilder.lambda$executeAndStream$1(DefaultInvocationBuilder.java:269)
    at java.lang.Thread.run(Thread.java:748)
    Caused by: java.io.IOException: com.sun.jna.LastErrorException: [2] No such file or directory
    at com.github.dockerjava.transport.DomainSocket.<init>(DomainSocket.java:63)
    at com.github.dockerjava.transport.BsdDomainSocket.<init>(BsdDomainSocket.java:43)
    at com.github.dockerjava.transport.DomainSocket.get(DomainSocket.java:138)
    at com.github.dockerjava.transport.UnixSocket.get(UnixSocket.java:27)
    at com.github.dockerjava.httpclient5.ApacheDockerHttpClientImpl$2.createSocket(ApacheDockerHttpClientImpl.java:145)
    at org.apache.hc.client5.http.impl.io.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:125)
    at org.apache.hc.client5.http.impl.io.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:407)
    at org.apache.hc.client5.http.impl.classic.InternalExecRuntime.connectEndpoint(InternalExecRuntime.java:168)
    at org.apache.hc.client5.http.impl.classic.InternalExecRuntime.connectEndpoint(InternalExecRuntime.java:178)
    at org.apache.hc.client5.http.impl.classic.ConnectExec.execute(ConnectExec.java:136)

Check whether Docker is running.
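The "No such file or directory" raised from the JNA domain-socket code means the client could not open the Docker socket. A minimal sketch of that check on a standard Linux install (the socket path below is the Docker default):

```shell
# Check for the Docker daemon's unix socket before retrying the operation.
if [ -S /var/run/docker.sock ]; then
  docker_status="docker socket: present"
else
  docker_status="docker socket: missing -- start the Docker daemon"
fi
echo "$docker_status"
```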


wolfboys commented 2 years ago
  1. #571: UDF log4j conflict


Resolution:

Inspect the dependency hierarchy to see where the logging jars conflict, and fix the UDF's log4j conflict.

Alternatively, add a Java parameter when starting the Flink job: -Dlog4j.ignoreTC=true

wolfboys commented 2 years ago
  1. Could not find a suitable table factory for 'org.apache.flink.table.factories.TableSourceFactory' in the classpath

Caused by: org.apache.flink.table.api.NoMatchingTableFactoryException: Could not find a suitable table factory for 'org.apache.flink.table.factories.TableSourceFactory' in the classpath.

Reason: Required context properties mismatch.

The matching candidates:
org.apache.flink.table.sources.CsvAppendTableSourceFactory
Mismatched properties:
'connector.type' expects 'filesystem', but is 'kafka'
'format.type' expects 'csv', but is 'json'

Flink version: 1.14.0

Solution: the flink-kafka-connector parameters are wrong. Refer to the Flink documentation:

CREATE TABLE user_log (
user_id VARCHAR,
item_id VARCHAR,
category_id VARCHAR,
behavior VARCHAR,
ts TIMESTAMP(3)
) WITH (
'connector' = 'kafka',
'topic' = 'user_behavior',
'properties.bootstrap.servers' = 'kafka-1:9092,kafka-2:9092,kafka-3:9092',
'properties.group.id' = 'testGroup',
'scan.startup.mode' = 'earliest-offset',
'format' = 'json'
);

CREATE TABLE pvuv_sink (
dt VARCHAR primary key,
pv BIGINT,
uv BIGINT
) WITH (
'connector' = 'jdbc', -- use the jdbc connector
'url' = 'jdbc:mysql://test-mysql:3306/test', -- jdbc url
'table-name' = 'pvuv_sink', -- table name
'username' = 'root', -- username
'password' = '123456' -- password
);

INSERT INTO pvuv_sink
SELECT
DATE_FORMAT(ts, 'yyyy-MM-dd HH:00') dt,
COUNT(*) AS pv,
COUNT(DISTINCT user_id) AS uv
FROM user_log
GROUP BY DATE_FORMAT(ts, 'yyyy-MM-dd HH:00');

In addition, the Kafka message format

{"user_id": "543462", "item_id":"1715", "category_id": "1464116", "behavior": "pv", "ts":"2021-02-01T01:00:00Z"}
{"user_id": "662867", "item_id":"2244074","category_id":"1575622","behavior": "pv", "ts":"2021-02-01T01:00:00Z"}
{"user_id": "662867", "item_id":"2244074","category_id":"1575622","behavior": "pv", "ts":"2021-02-01T01:00:00Z"}
{"user_id": "662867", "item_id":"2244074","category_id":"1575622","behavior": "learning flink", "ts":"2021-02-01T01:00:00Z"}

must be changed to

{"user_id": "543462", "item_id":"1715", "category_id": "1464116", "behavior": "pv", "ts":"2021-02-01 01:00:00"}
{"user_id": "662867", "item_id":"2244074","category_id":"1575622","behavior": "pv", "ts":"2021-02-01 01:00:00"}
{"user_id": "662867", "item_id":"2244074","category_id":"1575622","behavior": "pv", "ts":"2021-02-01 01:00:00"}
{"user_id": "662867", "item_id":"2244074","category_id":"1575622","behavior": "learning flink", "ts":"2021-02-01 01:00:00"}

Otherwise parsing of the records fails.


wolfboys commented 2 years ago
  1. Windows IDEA environment: could not submit a Flink job to a remote YARN cluster

Dynamically modify the input parameters of the submitted job: the entry point is the YarnClientImpl class, method submitApplication, at the submission call this.rmClient.submitApplication(request). In the request parameters, replace the separator used in the CLASSPATH and _FLINK_CLASSPATH values from the Windows ";" with the Linux ":".

This should be the same problem as the one reported upstream: https://issues.apache.org/jira/browse/FLINK-17858
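The separator fix described above amounts to a simple string substitution on the classpath values before submission; a minimal sketch (the jar names are placeholders):

```shell
# Windows builds a ';'-separated CLASSPATH / _FLINK_CLASSPATH, but the YARN
# containers run on Linux and expect ':' -- translate before submitApplication.
win_cp='flink-dist.jar;log4j.properties;job.jar'
linux_cp=$(printf '%s' "$win_cp" | tr ';' ':')
echo "$linux_cp"
```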

bulolo commented 1 year ago

For the 2.0.0 Docker deployment, the .env references an existing MySQL. Does the streampark database have to be imported manually, or will it be created automatically the first time Docker starts?

0akarma commented 1 year ago

> For the 2.0.0 Docker deployment, the .env references an existing MySQL. Does the streampark database have to be imported manually, or will it be created automatically the first time Docker starts?

You have to enter the MySQL container yourself and run the SQL scripts; by default nothing is initialized.

wolfboys commented 1 year ago
  1. Does streampark integrate Flink CDC? Is real-time synchronization based on change logs supported?

Yes. Both DataStream jobs written with Flink CDC and Flink SQL jobs are supported; any standard Flink job is supported. For a Flink SQL job, the connector must be a standard Flink SQL connector implemented according to the Flink specification; just add the corresponding dependency jar or pom entry.

2000liux commented 1 year ago

Compile using this command: mvn clean install -DskipTests -Dcheckstyle.skip -Dmaven.javadoc.skip=true

changeme2012 commented 1 year ago

Submitting a Flink SQL job fails, and the cause of the failure cannot be found.

Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.streampark.flink.client.FlinkClient$.$anonfun$proxy$1(FlinkClient.scala:80)
    at org.apache.streampark.flink.proxy.FlinkShimsProxy$.$anonfun$proxy$1(FlinkShimsProxy.scala:60)
    at org.apache.streampark.common.util.ClassLoaderUtils$.runAsClassLoader(ClassLoaderUtils.scala:38)
    at org.apache.streampark.flink.proxy.FlinkShimsProxy$.proxy(FlinkShimsProxy.scala:60)
    at org.apache.streampark.flink.client.FlinkClient$.proxy(FlinkClient.scala:75)
    at org.apache.streampark.flink.client.FlinkClient$.submit(FlinkClient.scala:49)
    at org.apache.streampark.flink.client.FlinkClient.submit(FlinkClient.scala)
    at org.apache.streampark.console.core.service.impl.ApplicationServiceImpl.lambda$start$10(ApplicationServiceImpl.java:1544)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
    ... 3 more
Caused by: java.lang.NoSuchFieldError: CANCEL_ENABLE
    at org.apache.streampark.flink.client.trait.FlinkClientTrait.submit(FlinkClientTrait.scala:102)
    at org.apache.streampark.flink.client.trait.FlinkClientTrait.submit$(FlinkClientTrait.scala:63)
    at org.apache.streampark.flink.client.impl.YarnApplicationClient$.submit(YarnApplicationClient.scala:44)
    at org.apache.streampark.flink.client.FlinkClientHandler$.submit(FlinkClientHandler.scala:40)
    at org.apache.streampark.flink.client.FlinkClientHandler.submit(FlinkClientHandler.scala)

Upgrade to version 2.1.1.

3yekn1 commented 1 year ago


  1. This can be solved by changing the dependency scope to provided.
  2. Are there other ways to solve it, for example a fat jar?
  3. Why does changing the child-first / parent-first option not take effect?
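For point 1, marking the conflicting dependency as provided in the pom keeps it out of the job's fat jar so the cluster's copy wins; a sketch with placeholder coordinates:

```xml
<dependency>
  <!-- placeholder coordinates: use the artifact that conflicts in your job -->
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-streaming-java</artifactId>
  <version>${flink.version}</version>
  <!-- provided: the runtime supplies this jar, so it is excluded from the fat jar -->
  <scope>provided</scope>
</dependency>
```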
caicancai commented 1 year ago

An error is reported when the source code is compiled


For now, you can comment out these two files to compile, then uncomment them and compile again. You can try that.

liyichencc commented 1 year ago
> 1. after streamx-console started, app.home is not set, and throw NullPointerException
>
> streamx-console initialization check failed. If started local for development and debugging, please ensure the -Dapp.home parameter is clearly specified in vm options, more detail: http://www.streamxhub.com/docs/user-guide/development/#vm-options

The link http://www.streamxhub.com/docs/user-guide/development/#vm-options has expired; now you can refer to this address: https://streampark.apache.org/zh-CN/docs/user-guide/deployment/

wangyg007 commented 1 month ago

Built from the dev branch and deployed locally; on the front end, clicking Add New under Spark Applications returns a 404.