apache / carbondata

High performance data store solution
carbondata.apache.org
Apache License 2.0

Integration with Spark #4296

Open vajaw opened 1 year ago

vajaw commented 1 year ago

Will CarbonData be integrated with newer Spark versions in the future? Can Spark 3.1.2 be integrated with CarbonData?

chenliang613 commented 1 year ago

Yes, the community is considering Spark 3.3.

vajaw commented 1 year ago

Comparing Spark 3.1.1 and Spark 3.1.2, the parameter list of the writeAndRead method in the DataSource class has grown from 4 parameters to 5. The two versions of the code are here:

https://github.com/apache/spark/blob/v3.1.2/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala
https://github.com/apache/spark/blob/v3.1.1/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala

Because the writeAndRead method of the DataSource class was modified in Spark 3.1.2, integrating CarbonData 2.3.0 with Spark 3.1.2 runs into problems. The cause is that CarbonReflectionUtils in CarbonData 2.3.0 references Spark's writeAndRead method:

https://github.com/apache/carbondata/blob/branch-2.3/integration/spark/src/main/scala/org/apache/spark/util/CarbonReflectionUtils.scala

Will the community support Spark 3.1.2, or other Spark 3.1.x releases, in the future?
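
To illustrate the kind of breakage described above, here is a minimal, self-contained sketch of arity-aware reflection. It is not CarbonData's actual code and does not use Spark's real signatures: the classes DataSourceOld and DataSourceNew, the String parameters, and the helper callWriteAndRead are all hypothetical stand-ins. It only shows one way a reflective caller might tolerate a method whose parameter count went from 4 to 5.

```scala
import java.lang.reflect.Method

// Hypothetical stand-ins for two library versions. These are NOT Spark's
// real DataSource signatures; only the 4-vs-5 arity mirrors the issue.
class DataSourceOld {
  def writeAndRead(a: String, b: String, c: String, d: String): String =
    s"old($a,$b,$c,$d)"
}

class DataSourceNew {
  def writeAndRead(a: String, b: String, c: String, d: String, e: String): String =
    s"new($a,$b,$c,$d,$e)"
}

object ArityAwareReflection {
  // Look up the single public method with the given name, whatever its arity.
  def findMethod(target: AnyRef, name: String): Method = {
    val candidates = target.getClass.getMethods.filter(_.getName == name)
    require(candidates.length == 1,
      s"expected exactly one '$name', found ${candidates.length}")
    candidates.head
  }

  // Pass the 4 base arguments, and append the extra one only when the
  // method found at runtime actually declares a fifth parameter.
  def callWriteAndRead(target: AnyRef, base: Seq[AnyRef], extra: AnyRef): AnyRef = {
    val method = findMethod(target, "writeAndRead")
    val args: Seq[AnyRef] = method.getParameterCount match {
      case 4 => base
      case 5 => base :+ extra
      case n => throw new UnsupportedOperationException(s"unsupported arity: $n")
    }
    method.invoke(target, args: _*)
  }
}
```

A caller that hard-codes the 4-argument form would fail against DataSourceNew, while a caller that branches on getParameterCount works with both. Whether this approach fits CarbonReflectionUtils depends on what the added Spark parameter means, which this sketch does not model.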