Open jack86596 opened 3 years ago
Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/6042/
Build Failed with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/432/
Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4298/
Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/6043/
Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4299/
Build Success with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/433/
retest this please
Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/6044/
Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/4300/
Build Success with Spark 3.1, Please check CI http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_3.1/434/
@jack86596 this is a behavioral and functional change. Instead of directly raising a PR with more code changes, it is better to first raise a discussion in the community, gather input, and then make the changes accordingly.
OK, I will raise a discussion on the mailing list.
Why is this PR needed?
Currently, the clean files command deletes all Marked for Delete and Compacted segments once the number of these segments reaches carbon.invisible.segments.preserve.count. This delete operation can take a long time, and the user cannot choose to delete only some of these segments. It would be better to enhance the clean files command so that the user can specify which segments to delete.
What changes were proposed in this PR?
Refactor lock handling: during clean files, take the tablestatus lock at the beginning and release it at the end. While the lock is held, read the tablestatus file only once (previously it could be read 10+ times) and perform all operations on that single copy, such as changing the visibility of a segment and moving segments with visibility = false to the tablestatus.history file.
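To make the intended flow easier to review, here is a minimal sketch of the "lock once, read once, write once" pattern described above, combined with user-selected segment IDs. It is not the code in this PR: `CleanFilesSketch`, `Segment`, `TableStatusStore` and `cleanFiles` are hypothetical names, the tablestatus and tablestatus.history files are replaced by in-memory lists, and a plain `ReentrantLock` stands in for the tablestatus lock.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.locks.ReentrantLock;

public class CleanFilesSketch {

    /** Hypothetical, simplified stand-in for one entry of the tablestatus file. */
    static final class Segment {
        final String id;
        final String status; // sample values: "Success", "Marked for Delete", "Compacted"
        boolean visible = true;

        Segment(String id, String status) {
            this.id = id;
            this.status = status;
        }
    }

    /** Hypothetical in-memory stand-in for tablestatus, tablestatus.history and the lock. */
    static final class TableStatusStore {
        final ReentrantLock lock = new ReentrantLock();
        List<Segment> tableStatus = new ArrayList<>();
        final List<Segment> history = new ArrayList<>();
        int readCount = 0; // counts how often the "file" is read

        List<Segment> read() {
            readCount++;
            return new ArrayList<>(tableStatus);
        }

        void write(List<Segment> live, List<Segment> movedToHistory) {
            tableStatus = live;
            history.addAll(movedToHistory);
        }
    }

    static boolean isStale(Segment s) {
        return s.status.equals("Marked for Delete") || s.status.equals("Compacted");
    }

    /** Cleans only the requested stale segments; returns the IDs that were cleaned. */
    static List<String> cleanFiles(TableStatusStore store, Set<String> requestedIds) {
        store.lock.lock(); // taken once, at the beginning
        try {
            List<Segment> snapshot = store.read(); // the only read while the lock is held
            List<Segment> live = new ArrayList<>();
            List<Segment> moved = new ArrayList<>();
            List<String> cleaned = new ArrayList<>();
            for (Segment s : snapshot) {
                if (isStale(s) && requestedIds.contains(s.id)) {
                    s.visible = false; // change visibility on the in-memory copy
                    moved.add(s);      // destined for the history list
                    cleaned.add(s.id);
                } else {
                    live.add(s);       // untouched: not stale or not requested
                }
            }
            store.write(live, moved); // single write-back of both lists
            return cleaned;
        } finally {
            store.lock.unlock(); // released once, at the end
        }
    }

    public static void main(String[] args) {
        TableStatusStore store = new TableStatusStore();
        store.tableStatus.add(new Segment("0", "Compacted"));
        store.tableStatus.add(new Segment("1", "Success"));
        store.tableStatus.add(new Segment("2", "Marked for Delete"));
        store.tableStatus.add(new Segment("3", "Compacted"));

        Set<String> requested = new HashSet<>(Arrays.asList("0", "1", "2"));
        System.out.println("cleaned: " + cleanFiles(store, requested));
        System.out.println("reads under lock: " + store.readCount);
    }
}
```

In this sketch, segment 1 is requested but skipped because it is not stale, and segment 3 is stale but kept because it was not requested. Whether the real command should reject such requests or silently ignore them is a design question for the mailing list discussion.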
Does this PR introduce any user interface change?
Is any new testcase added?