apache / hudi

Upserts, Deletes And Incremental Processing on Big Data.
https://hudi.apache.org/
Apache License 2.0
5.23k stars 2.39k forks source link

[SUPPORT] how to use hudi in cdp 7.1.7 #8141

Closed MarlboroBoy closed 1 year ago

MarlboroBoy commented 1 year ago

Tips before filing an issue

Describe the problem you faced

A clear and concise description of the problem.

To Reproduce

Steps to reproduce the behavior:

1. 2. 3. 4.

Expected behavior

A clear and concise description of what you expected to happen.

Environment Description

Additional context

Add any other context about the problem here.

Stacktrace

Add the stacktrace of the error.

danny0405 commented 1 year ago

Hi, can you give more details here?

MarlboroBoy commented 1 year ago

Hi,I use hudi 0.13 on cdp7.1.7 with spark 3.2,When I use the spark.sql to create a new table raise an exception.I want to ask if it is necessary to compile hudi separately if it is used on cdp。

error msg is :

scala> spark.sql("create table hudi_cow_pt_tbl ( id bigint, name string, ts bigint, dt string, hh string ) using hudi tblproperties ( type = 'cow', primaryKey = 'id', preCombineField = 'ts' ) partitioned by (dt, hh) location '/tmp/hudi/hudi_cow_pt_tbl'") org.apache.spark.sql.AnalysisException: Multiple sources found for hudi (org.apache.hudi.Spark3DefaultSource, org.apache.hudi.Spark32PlusDefaultSource), please specify the fully qualified class name.

at org.apache.spark.sql.errors.QueryCompilationErrors$.findMultipleDataSourceError(QueryCompilationErrors.scala:1049) at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:697) at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSourceV2(DataSource.scala:720) at org.apache.spark.sql.catalyst.analysis.ResolveSessionCatalog.org$apache$spark$sql$catalyst$analysis$ResolveSessionCatalog$$isV2Provider(ResolveSessionCatalog.scala:636) at org.apache.spark.sql.catalyst.analysis.ResolveSessionCatalog$$anonfun$apply$1.applyOrElse(ResolveSessionCatalog.scala:165) at org.apache.spark.sql.catalyst.analysis.ResolveSessionCatalog$$anonfun$apply$1.applyOrElse(ResolveSessionCatalog.scala:48) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$3(AnalysisHelper.scala:138) at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:82) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$1(AnalysisHelper.scala:138) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:323) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning(AnalysisHelper.scala:134) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning$(AnalysisHelper.scala:130) at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUpWithPruning(LogicalPlan.scala:30) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUp(AnalysisHelper.scala:111) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUp$(AnalysisHelper.scala:110) at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUp(LogicalPlan.scala:30) at org.apache.spark.sql.catalyst.analysis.ResolveSessionCatalog.apply(ResolveSessionCatalog.scala:48) at org.apache.spark.sql.catalyst.analysis.ResolveSessionCatalog.apply(ResolveSessionCatalog.scala:42) at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$2(RuleExecutor.scala:211) at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126) at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122) at scala.collection.immutable.List.foldLeft(List.scala:89)

danny0405 commented 1 year ago

Multiple sources found for hudi (org.apache.hudi.Spark3DefaultSource, org.apache.hudi.Spark32PlusDefaultSource), please specify the fully qualified class name.

Seems it is an issues of the user action.

MarlboroBoy commented 1 year ago

![Uploading WX20230309-193057.png…]()

ad1happy2go commented 1 year ago

@MarlboroBoy Normally this happens when more than one hudi jar is in class path. In this case it looks like both the build i.e. with spark 3 and with spark 3.2 are added to the class path. So this is giving this error.

Were you able to resolve same? if yes can we close this issue.