apache / paimon

Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
https://paimon.apache.org/
Apache License 2.0
2.44k stars, 959 forks

[spark] remove the dependency of paimon-bundle in all submodules of paimon-spark. #4352

Closed: liming30 closed this 1 month ago

liming30 commented 1 month ago

Purpose

Linked issue: close #4351

Remove the paimon-bundle dependency from all submodules of paimon-spark, so that Paimon classes can be resolved when running from IntelliJ IDEA.

Tests

API and Format

Documentation

wwj6591812 commented 1 month ago

+1

liming30 commented 1 month ago

@JingsongLi @Zouxxyy can I merge this PR?

Zouxxyy commented 1 month ago

Sorry for the delay. Let me test it in my IDEA 2024.

Zouxxyy commented 1 month ago

I ran MigrateFileProcedureTest in my IDEA and it throws this:

An exception or error caused a run to abort: COMPRESSION_ZSTD_LEVEL 
java.lang.NoSuchFieldError: COMPRESSION_ZSTD_LEVEL
    at org.apache.paimon.format.orc.OrcFileFormat.getOrcProperties(OrcFileFormat.java:152)
    at org.apache.paimon.format.orc.OrcFileFormat.<init>(OrcFileFormat.java:75)
    at org.apache.paimon.format.orc.OrcFileFormatFactory.create(OrcFileFormatFactory.java:35)
    at org.apache.paimon.format.orc.OrcFileFormatFactory.create(OrcFileFormatFactory.java:24)
    at org.apache.paimon.format.FileFormat.fromIdentifier(FileFormat.java:106)
    at org.apache.paimon.format.FileFormat.fromIdentifier(FileFormat.java:91)

liming30 commented 1 month ago

@Zouxxyy There is a version conflict with orc-core. When I exclude it, all the tests run successfully in IDEA. But when I run them with mvn in the terminal, they fail because the ORC classes cannot be found. Do you have any other suggestions?
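
For anyone hitting the same NoSuchFieldError, here is a minimal sketch (not part of this PR) for checking which jar a class is actually loaded from and whether it exposes a given field. `ClasspathProbe` is a hypothetical helper, and the default class name `org.apache.orc.OrcConf` is my assumption about where COMPRESSION_ZSTD_LEVEL is declared; pass other names as arguments if that is wrong.

```java
import java.security.CodeSource;

/** Hypothetical helper: reports where a class was loaded from and whether it has a public field. */
public class ClasspathProbe {

    /** Returns the jar or directory that supplied the class, or a marker when none is recorded. */
    static String locationOf(Class<?> cls) {
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        return source == null ? "<JDK or unknown>" : String.valueOf(source.getLocation());
    }

    /** True if the class exposes a public field (enum constants included) with this name. */
    static boolean hasPublicField(Class<?> cls, String fieldName) {
        try {
            cls.getField(fieldName);
            return true;
        } catch (NoSuchFieldException e) {
            return false;
        }
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Assumed defaults; override with: ClasspathProbe <className> <fieldName>
        String className = args.length > 0 ? args[0] : "org.apache.orc.OrcConf";
        String fieldName = args.length > 1 ? args[1] : "COMPRESSION_ZSTD_LEVEL";

        // Throws ClassNotFoundException if the class is not on the classpath at all.
        Class<?> cls = Class.forName(className);
        System.out.println(className + " loaded from " + locationOf(cls));
        System.out.println(fieldName + " present: " + hasPublicField(cls, fieldName));
    }
}
```

Run it with the same classpath as the failing test. If the reported location is an older jar pulled in through the Spark dependencies and the field is missing, that would be consistent with the version conflict described above.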

Zouxxyy commented 1 month ago

@liming30 Honestly, I don't have a better solution yet. Can you try: 1) removing paimon-bundle, 2) excluding the orc or parquet deps from the Spark deps, and 3) adding the orc or parquet deps back at the version Paimon uses?

Zouxxyy commented 1 month ago

@liming30 Can you try this https://github.com/Zouxxyy/incubator-paimon/commit/05d3a86936704d5138e713b33bd93d43c829b116

liming30 commented 1 month ago

Thanks for your help, I'll try this again later.

liming30 commented 1 month ago

@Zouxxyy Some tests still fail in IDEA, such as org.apache.paimon.spark.procedure.MigrateTableProcedureTest. I will try to solve this problem next.

askwang commented 1 month ago

@liming30 It tested OK in my IDEA. Maybe you need to rebuild the project.

YannByron commented 1 month ago

That works. +1.