Open ShreelekhyaG opened 2 years ago
Build Failed with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/772/
Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/6386/
Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4641/
Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/6387/
Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4642/
Build Success with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/773/
Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/6390/
Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4645/
Build Success with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/776/
LGTM
Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/6393/
Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4650/
Build Success with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/779/
retest this please
Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/6395/
Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4652/
Build Failed with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/781/
retest this please
Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4653/
Build Failed with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/782/
Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/6396/
LGTM
retest this please
Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/6397/
Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4654/
Build Failed with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/783/
Build Failed with Spark 2.4.5, Please check CI http://159.138.8.58:12602/job/ApacheCarbon_PR_Builder_2.4.5/4659/
Why is this PR needed?
Performance degradation for Incremental updates is observed in the partition table.
fileNameToMetaInfoMapping
map. On incremental update for partition table, the number of invalid files keep on increasing each time which is causing the degradation increateCarbonDataFileBlockMetaInfoMapping
method.fileNameToMetaInfoMapping
map. The number of invalid files keeps on increasing with each update which is causing the degradation in creatingfileNameToMetaInfoMapping
map.Invalid segments cache is not removed after delete/update.
What changes were proposed in this PR?
Instead of listing files, made a change to get the carbon file from the file name and create BlockMetaInfo directly in
createBlockMetaInfo
. Impact when tested on a single partition with 100 segments:select count(*)
operation. Because inselect count(*)
flow it was listing files for each segment and the map was not reused. Impact when tested on a non-partition table with 100 segments:Clearing invalid/deleted segments from cache after delete and update.
Does this PR introduce any user interface change?
Is any new testcase added?