The purpose of this enhancement is to create a qualification tool that analyzes customer event files to determine which workloads are suitable for execution with Gluten. This is crucial when onboarding new customers, because not all workloads benefit from Gluten's native acceleration, especially workloads that rely on RDD operations, unsupported SQL operators, or UDFs.
Proposed Solution:
Develop a Java program that takes a Hadoop file path as input and analyzes the event files at that path. The program will generate two reports:
Application Report:
- Percentage of RDD usage
- Percentage of unsupported SQL operations
- Percentage of supported SQL operations
- Cumulative task time for each application
- Recommendation to use Gluten acceleration (recommended if the percentage of supported SQL operations is >= 70%)
Unsupported Operator Report:
- Unsupported SQL operators
- Impact on cumulative CPU time
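The two reports above could be derived roughly as in the sketch below. It is only an illustration under stated assumptions, not the proposed implementation: the class and method names (`QualificationSketch`, `AppStats`, `OperatorCost`) are hypothetical, the percentages are assumed to be shares of cumulative task time (the proposal does not say how they are measured), and event-file parsing is left out entirely. Only the 70% threshold comes from the proposal.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class QualificationSketch {

    // From the proposal: recommend Gluten when supported SQL is at least 70%.
    static final double SUPPORTED_SQL_THRESHOLD = 70.0;

    /** Hypothetical per-application totals, in milliseconds of task time. */
    static final class AppStats {
        final String appId;
        final long rddMs;
        final long supportedSqlMs;
        final long unsupportedSqlMs;

        AppStats(String appId, long rddMs, long supportedSqlMs, long unsupportedSqlMs) {
            this.appId = appId;
            this.rddMs = rddMs;
            this.supportedSqlMs = supportedSqlMs;
            this.unsupportedSqlMs = unsupportedSqlMs;
        }

        // Cumulative task time, assumed here to be the sum of the three buckets.
        long totalMs() {
            return rddMs + supportedSqlMs + unsupportedSqlMs;
        }
    }

    /** Hypothetical pairing of an unsupported operator with the CPU time attributed to it. */
    static final class OperatorCost {
        final String operator;
        final long cpuMs;

        OperatorCost(String operator, long cpuMs) {
            this.operator = operator;
            this.cpuMs = cpuMs;
        }
    }

    // Share of `part` in `total` as a percentage; 0 when there is no recorded time.
    static double percent(long part, long total) {
        return total == 0 ? 0.0 : 100.0 * part / total;
    }

    // Applies the >= 70% supported-SQL rule from the proposal.
    static boolean recommend(AppStats s) {
        return percent(s.supportedSqlMs, s.totalMs()) >= SUPPORTED_SQL_THRESHOLD;
    }

    // One Application Report row: id, RDD %, unsupported SQL %, supported SQL %,
    // cumulative task time, recommendation. The CSV layout is an assumption.
    static String applicationReportRow(AppStats s) {
        long total = s.totalMs();
        return String.format(Locale.ROOT, "%s,%.1f,%.1f,%.1f,%d,%s",
                s.appId,
                percent(s.rddMs, total),
                percent(s.unsupportedSqlMs, total),
                percent(s.supportedSqlMs, total),
                total,
                recommend(s) ? "RECOMMENDED" : "NOT_RECOMMENDED");
    }

    // Unsupported Operator Report: total CPU time per unsupported operator,
    // kept in first-seen order.
    static Map<String, Long> unsupportedOperatorImpact(List<OperatorCost> costs) {
        Map<String, Long> totals = new LinkedHashMap<>();
        for (OperatorCost c : costs) {
            totals.merge(c.operator, c.cpuMs, Long::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        AppStats app = new AppStats("app-0001", 100, 800, 100);
        System.out.println(applicationReportRow(app));

        // Operator names are placeholders, not a statement about Gluten's coverage.
        List<OperatorCost> costs = Arrays.asList(
                new OperatorCost("OperatorA", 50),
                new OperatorCost("OperatorB", 10),
                new OperatorCost("OperatorA", 25));
        unsupportedOperatorImpact(costs)
                .forEach((op, ms) -> System.out.println(op + "," + ms));
    }
}
```

Keeping the threshold in a single constant makes the recommendation rule easy to tune once real customer workloads have been evaluated.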
Requirements:
Compatibility with Hadoop file paths that point to: