Apache Linkis builds a computation middleware layer to facilitate connection, governance and orchestration between the upper applications and the underlying data engines.
What is the purpose of the change
Add support for SparkMeasure to the Spark engine for better monitoring of Spark's performance.
Related issues/PRs
Related issues: #5200
How to use
Here is an example of submitting parameters using the RESTful API, with the specific parameters listed below:

{
  "executionContent": {
    "code": "select * from test1.test1",
    "runType": "sql"
  },
  "labels": {
    "engineType": "spark-3.2.1",
    "userCreator": "zhangyuyao-IDE"
  },
  "params": {
    "configuration": {
      "runtime": {
        "linkis.sparkmeasure.aggregate.type": "stage"
      },
      "startup": {
        "linkis.sparkmeasure.flight.recorder.type": "task"
      }
    }
  }
}
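As a minimal sketch, the request body above can be built and serialized as follows; the gateway host and authentication headers in the comment are deployment-specific placeholders, and the endpoint path follows the usual Linkis entrance REST convention (verify against your deployment):

```python
import json

# Payload mirroring the example above: run a SQL query on Spark 3.2.1 with
# SparkMeasure stage-level aggregate metrics and task-level flight-recorder
# metrics enabled.
payload = {
    "executionContent": {"code": "select * from test1.test1", "runType": "sql"},
    "labels": {"engineType": "spark-3.2.1", "userCreator": "zhangyuyao-IDE"},
    "params": {
        "configuration": {
            "runtime": {"linkis.sparkmeasure.aggregate.type": "stage"},
            "startup": {"linkis.sparkmeasure.flight.recorder.type": "task"},
        }
    },
}

body = json.dumps(payload)
print(body)

# To actually submit, POST `body` to the Linkis entrance service, e.g.
# (host, port, and auth headers depend on your deployment):
#
#   import requests
#   requests.post("http://<gateway-host>:9001/api/rest_j/v1/entrance/submit",
#                 data=body, headers={"Content-Type": "application/json"})
```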
It only takes effect for certain SQL statements, such as SELECT, INSERT, and CREATE TABLE AS SELECT.
Indicators fall into two categories, aggregate and detailed, and each can be collected at one of two granularities: stage or task.
For aggregate indicators (controlled by the parameter linkis.sparkmeasure.aggregate.type), a separate metrics file is written for each eligible SQL query.
For detailed indicators (controlled by the parameter linkis.sparkmeasure.flight.recorder.type), an engine outputs only one file, which is generated when the engine shuts down. When using this feature, it is recommended to execute only one SQL statement per engine.
Because of the Linkis engine-reuse mechanism, setting linkis.sparkmeasure.flight.recorder.type does not necessarily create a new engine; if an existing engine is reused, no indicator file may be produced.
To ensure that the file is written to the correct path, set the linkis.sparkmeasure.output.prefix parameter on the Spark engine.
Writing files to both local and HDFS storage is currently supported.
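The parameters above can be assembled with a small illustrative helper (not part of Linkis) that validates the stage/task sub-categories; placing the output prefix under startup and the hdfs:/// path are assumptions for the example:

```python
# Allowed granularities for both SparkMeasure parameter types, per the
# description above.
VALID_TYPES = {"stage", "task"}

def sparkmeasure_params(aggregate_type=None, recorder_type=None,
                        output_prefix=None):
    """Build the 'configuration' section of a Linkis submit request body."""
    runtime, startup = {}, {}
    if aggregate_type is not None:
        if aggregate_type not in VALID_TYPES:
            raise ValueError(f"aggregate type must be one of {VALID_TYPES}")
        runtime["linkis.sparkmeasure.aggregate.type"] = aggregate_type
    if recorder_type is not None:
        if recorder_type not in VALID_TYPES:
            raise ValueError(f"flight recorder type must be one of {VALID_TYPES}")
        startup["linkis.sparkmeasure.flight.recorder.type"] = recorder_type
    if output_prefix is not None:
        # Directs metric files to a known local or HDFS location
        # (hypothetical path below; adjust to your environment).
        startup["linkis.sparkmeasure.output.prefix"] = output_prefix
    return {"configuration": {"runtime": runtime, "startup": startup}}

params = sparkmeasure_params(aggregate_type="stage",
                             recorder_type="task",
                             output_prefix="hdfs:///tmp/sparkmeasure")
print(params)
```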
For a detailed introduction to SparkMeasure, please refer to the SparkMeasure documentation.
Note
The files generated by SparkMeasure Flight Recorder can be relatively large.
Checklist