[CARBONDATA-4274] Fix create partition table error with spark 3.1

apache / carbondata

High performance data store solution

carbondata.apache.org

Apache License 2.0

[CARBONDATA-4274] Fix create partition table error with spark 3.1 #4208

Closed. ShreelekhyaG closed this 3 years ago

ShreelekhyaG commented 3 years ago

Why is this PR needed?

With Spark 3.1, a partition table can be created by naming partition columns that are already defined in the table schema, for example: create table partitionTable(c1 int, c2 int, v1 string, v2 string) stored as carbondata partitioned by (v2,c2)

When the table is created through a SparkSession with CarbonExtension, the catalog table is created with the specified partitions. But in a cluster, or with a Carbon session, the same syntax creates a normal table with no partitions.

What changes were proposed in this PR?

partitionByStructFields is empty when the partition column names are given directly, so a partition table was not created. This PR identifies the partition column names and gets the struct field and data type info for them from the table columns.
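The lookup described above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the CarbonData code: `Field` and `resolve_partition_fields` are hypothetical names standing in for Spark's struct fields and the PR's actual logic, and the case-insensitive matching is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Field:
    """Hypothetical stand-in for a struct field: a column name and its data type."""
    name: str
    data_type: str


def resolve_partition_fields(table_columns: List[Field],
                             partition_names: List[str]) -> List[Field]:
    """Look up each partition column name in the table columns.

    Returns the matching fields in the order the partition names were given.
    Matching is case-insensitive, which is an assumption of this sketch.
    """
    by_name = {f.name.lower(): f for f in table_columns}
    resolved = []
    for name in partition_names:
        field = by_name.get(name.lower())
        if field is None:
            raise ValueError(f"partition column {name!r} is not in the table schema")
        resolved.append(field)
    return resolved


# Columns and partition names taken from the example DDL in this PR.
schema = [Field("c1", "int"), Field("c2", "int"),
          Field("v1", "string"), Field("v2", "string")]
partition_fields = resolve_partition_fields(schema, ["v2", "c2"])
print(partition_fields)
```

With the example table, the names `v2` and `c2` resolve to a string field and an int field, which is the information that was missing when partitionByStructFields came back empty.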

Does this PR introduce any user interface change?

No

Is any new testcase added?
Yes, tested in cluster.

CarbonDataQA2 commented 3 years ago

Build Failed with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/277/

CarbonDataQA2 commented 3 years ago

Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5874/

CarbonDataQA2 commented 3 years ago

Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4132/

MarvinLitt commented 3 years ago

how does the CI pass under spark3.1? the partition table test case should work to take care of this issue, right?

ShreelekhyaG commented 3 years ago

> how does the CI pass under spark3.1? the partition table test case should work to take care of this issue, right?

Prior to this, CI had no test case using the syntax below, which creates a partition table by giving partition column names from the schema: create table partitionTable(c1 int, c2 int, v1 string, v2 string) stored as carbondata partitioned by (v2,c2)

With Spark 2.3 and 2.4, the above syntax fails during parsing; only the syntax that gives partition column names together with data types is valid, like below: create table partitionTable(c1 int, v1 string) stored as carbondata partitioned by (v2 string,c2 int)

Spark 3.1 supports creating a partition table with both of the above syntax types.
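To make the difference between the two forms concrete, here is a small sketch that classifies the contents of a `partitioned by (...)` clause. It is plain Python for illustration only and is not how Spark or CarbonData parse DDL; `parse_partition_clause` and `clause_form` are hypothetical helpers, and the comma split is a simplification that does not handle parameterised types such as `decimal(10,2)`.

```python
from typing import List, Optional, Tuple


def parse_partition_clause(clause: str) -> List[Tuple[str, Optional[str]]]:
    """Split 'v2 string,c2 int' or 'v2,c2' into (name, type-or-None) pairs.

    Simplification: entries are split on commas, so types that contain
    commas are not supported by this sketch.
    """
    entries = []
    for item in clause.split(","):
        parts = item.split()
        if not parts:
            raise ValueError("empty partition column entry")
        name = parts[0]
        # Anything after the name is treated as the data type; None if absent.
        data_type = " ".join(parts[1:]) or None
        entries.append((name, data_type))
    return entries


def clause_form(clause: str) -> str:
    """Return 'typed', 'names-only', or 'mixed' for a partition clause."""
    typed = [data_type is not None
             for _, data_type in parse_partition_clause(clause)]
    if all(typed):
        return "typed"
    if not any(typed):
        return "names-only"
    return "mixed"


print(clause_form("v2 string,c2 int"))
print(clause_form("v2,c2"))
```

Per the comment above, the typed form is the one accepted on Spark 2.3 and 2.4, while the names-only form parses only on Spark 3.1 and is the case this PR fixes.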

CarbonDataQA2 commented 3 years ago

Build Failed with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/286/

CarbonDataQA2 commented 3 years ago

Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5884/

CarbonDataQA2 commented 3 years ago

Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4141/

CarbonDataQA2 commented 3 years ago

Build Failed with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/288/

CarbonDataQA2 commented 3 years ago

Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5886/

CarbonDataQA2 commented 3 years ago

Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4143/

ShreelekhyaG commented 3 years ago

retest this please

CarbonDataQA2 commented 3 years ago

Build Success with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/289/

CarbonDataQA2 commented 3 years ago

Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5887/

CarbonDataQA2 commented 3 years ago

Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4144/

CarbonDataQA2 commented 3 years ago

Build Success with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/294/

CarbonDataQA2 commented 3 years ago

Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5893/

CarbonDataQA2 commented 3 years ago

Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4149/

kunal642 commented 3 years ago

LGTM

kunal642 commented 3 years ago

retest this please

CarbonDataQA2 commented 3 years ago

Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5903/

CarbonDataQA2 commented 3 years ago

Build Success with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/303/

CarbonDataQA2 commented 3 years ago

Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4158/
