apache / flink-cdc

Flink CDC is a streaming data integration tool
https://nightlies.apache.org/flink/flink-cdc-docs-stable
Apache License 2.0

java.lang.NoClassDefFoundError: com/ververica/cdc/connectors/shaded/org/apache/commons/collections/map/LinkedMap #3663

Closed tttzzzwww closed 3 weeks ago

tttzzzwww commented 3 weeks ago

I built a fat jar locally with Flink CDC 2.4 and ran into the following error when running it: `java.lang.NoClassDefFoundError: com/ververica/cdc/connectors/shaded/org/apache/commons/collections/map/LinkedMap

at com.ververica.cdc.debezium.DebeziumSourceFunction.<init>(DebeziumSourceFunction.java:146)
at com.ververica.cdc.connectors.oracle.OracleSource$Builder.build(OracleSource.java:195)
at com.dlink.cdc.oracle.OracleCDCBuilder.build(OracleCDCBuilder.java:115)
at com.dlink.trans.ddl.CreateCDCSourceOperation.build(CreateCDCSourceOperation.java:200)
at com.dlink.interceptor.FlinkInterceptor.build(FlinkInterceptor.java:54)
at com.dlink.executor.Executor.pretreatExecute(Executor.java:230)
at com.dlink.executor.Executor.executeSql(Executor.java:243)
at com.flinkbi.FlinkBiSync.flinkJobSubmit(FlinkBiSync.java:153)`
  1. At first I assumed it would be enough to add the missing class to the jar; see https://github.com/apache/flink-cdc/pull/2501
  2. After adding it, I hit a different exception: `Caused by: java.lang.ClassCastException: cannot assign instance of com.ververica.cdc.connectors.shaded.org.apache.commons.collections.map.LinkedMap to field com.ververica.cdc.debezium.DebeziumSourceFunction.pendingOffsetsToCommit of type org.apache.commons.collections.map.LinkedMap in instance of com.ververica.cdc.debezium.DebeziumSourceFunction
     at java.base/java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2205)
     at java.base/java.io.ObjectStreamClass$FieldReflector.checkObjectFieldValueTypes(ObjectStreamClass.java:2168)
     at java.base/java.io.ObjectStreamClass.checkObjFieldValueTypes(ObjectStreamClass.java:1422)
     at java.base/java.io.ObjectInputStream.defaultCheckFieldValues(ObjectInputStream.java:2450)
     at java.base/java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2357)
     at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2166)
     at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1668)
     at java.base/java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2434)
     at java.base/java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2328)
     at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2166)
     at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1668)
     at java.base/java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2434)
     at java.base/java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2328)
     at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2166)
     at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1668)
     at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:482)
     at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:440)
     at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:617)
     at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:602)
     at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:589)
     at org.apache.flink.util.InstantiationUtil.readObjectFromConfig(InstantiationUtil.java:543)
     at org.apache.flink.streaming.api.graph.StreamConfig.getStreamOperatorFactory(StreamConfig.java:383)
     ... 9 more`

This exception made me think that adding the missing class is not the right fix and that I should instead find out why the class is missing. Nothing in my pom.xml changes the reference path of org.apache.commons.collections, so I suspected that some other configuration was causing references to org.apache.commons.collections in the fat jar to gain a prefix: com/ververica/cdc/connectors/shaded/org/apache/commons/collections/map/LinkedMap.
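One way to see which of the two class names is actually visible at runtime is to probe the classpath for both. The following is a minimal sketch, not part of Flink CDC: `ClassProbe` and `locate` are hypothetical helper names, and it assumes the program is run with the fat jar on the classpath.

```java
import java.security.CodeSource;
import java.util.Optional;

public class ClassProbe {

    /**
     * Returns the location a class was loaded from, or empty if the class
     * cannot be found. Classes that ship with the JDK report "<jdk>".
     */
    public static Optional<String> locate(String className) {
        try {
            // initialize=false: resolve the class without running static initializers
            Class<?> cls = Class.forName(className, false, ClassProbe.class.getClassLoader());
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            if (source == null || source.getLocation() == null) {
                return Optional.of("<jdk>");
            }
            return Optional.of(source.getLocation().toString());
        } catch (ClassNotFoundException | LinkageError e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        String[] names = {
            "org.apache.commons.collections.map.LinkedMap",
            "com.ververica.cdc.connectors.shaded.org.apache.commons.collections.map.LinkedMap"
        };
        for (String name : names) {
            System.out.println(name + " -> " + locate(name).orElse("not found"));
        }
    }
}
```

If only the unshaded name resolves while the connector code refers to the shaded one, the references were rewritten at packaging time rather than the class being genuinely absent.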

After repeated trial and error, I found that the cause is the pom.xml of the flink-sql-connector-oceanbase-cdc module: [image] The relocation configuration of its maven-shade-plugin contains the pattern org.apache.commons. This pattern matches every package whose name starts with org.apache.commons and relocates it to com.ververica.cdc.connectors.shaded.org.apache.commons, so org.apache.commons.collections is relocated as well.
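The prefix matching described above can be modeled in a few lines. This is a simplified sketch under stated assumptions, not the plugin's implementation: the real maven-shade-plugin also supports includes and excludes and rewrites bytecode references. `RelocationSketch` and `relocate` are hypothetical names.

```java
public class RelocationSketch {

    /**
     * Simplified model of a shade relocation: if the class name starts with
     * the pattern, the pattern prefix is replaced by the shaded prefix.
     */
    public static String relocate(String className, String pattern, String shadedPattern) {
        if (className.startsWith(pattern)) {
            return shadedPattern + className.substring(pattern.length());
        }
        return className;
    }

    public static void main(String[] args) {
        String pattern = "org.apache.commons";
        String shaded = "com.ververica.cdc.connectors.shaded.org.apache.commons";

        // Intended target of the relocation
        System.out.println(relocate("org.apache.commons.lang3.StringUtils", pattern, shaded));
        // Unintended target: commons-collections shares the same prefix
        System.out.println(relocate("org.apache.commons.collections.map.LinkedMap", pattern, shaded));
        // Unrelated package: left untouched
        System.out.println(relocate("org.apache.flink.util.InstantiationUtil", pattern, shaded));
    }
}
```

The second call shows how a broad pattern also captures `org.apache.commons.collections.map.LinkedMap`, which matches the class name in the `NoClassDefFoundError`.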

The incorrect result: [image] Code that uses org.apache.commons.collections is affected as well: its references are rewritten during packaging, even though this code does not need to change.

After removing this configuration in a test build, the generated fat jar is correct: [image]
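To check a built fat jar without opening it by hand, one option is to search every class file for the relocated internal name. The sketch below is an assumption-laden helper, not a Flink CDC tool: `RelocatedReferenceScan` and `classesReferencing` are hypothetical names, and it relies on class names being stored as plain ASCII text inside class files.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

public class RelocatedReferenceScan {

    /** Returns the .class entries of the jar whose bytes contain the given internal name. */
    public static List<String> classesReferencing(String jarPath, String internalName) throws IOException {
        List<String> hits = new ArrayList<>();
        try (JarFile jar = new JarFile(jarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                if (!entry.getName().endsWith(".class")) {
                    continue;
                }
                try (InputStream in = jar.getInputStream(entry)) {
                    // ISO-8859-1 maps each byte to one char, so an ASCII name can be searched as text
                    String raw = new String(readAll(in), StandardCharsets.ISO_8859_1);
                    if (raw.contains(internalName)) {
                        hits.add(entry.getName());
                    }
                }
            }
        }
        return hits;
    }

    private static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: RelocatedReferenceScan <path-to-fat-jar>");
            return;
        }
        String shaded = "com/ververica/cdc/connectors/shaded/org/apache/commons/collections/";
        for (String name : classesReferencing(args[0], shaded)) {
            System.out.println(name);
        }
    }
}
```

An empty result for the shaded collections path would suggest that no class in the jar refers to the relocated `LinkedMap`.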

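Question: is it a problem that flink-sql-connector-oceanbase-cdc relocates every package starting with org.apache.commons? Should the relocation be narrowed to the specific package paths that actually need it?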

tttzzzwww commented 3 weeks ago

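I changed the relocations configuration in the sql-oceanbase module. The original configuration: `

<relocation>
<pattern>org.apache.commons</pattern>
<shadedPattern>
    com.ververica.cdc.connectors.shaded.org.apache.commons
</shadedPattern>
</relocation>

`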

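The configuration after the change: `

<relocation>
<pattern>org.apache.commons.lang3</pattern>
<shadedPattern>com.ververica.cdc.connectors.shaded.org.apache.commons.lang3</shadedPattern>
</relocation>
<relocation>
<pattern>org.apache.commons.codec</pattern>
<shadedPattern>com.ververica.cdc.connectors.shaded.org.apache.commons.codec</shadedPattern>
</relocation>

` The project currently only needs org.apache.commons.lang3 and org.apache.commons.codec to be relocated, so it is enough to spell out exactly those package paths. This avoids prefix-matching other packages that should not be relocated, which is what caused the exception.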