[X] I have searched in the issues and found no similar issues.
Describe the feature
Leverage the Spark DataSource V2 (DSv2) API to implement a connector that provides a SQL interface to the YARN aggregated logs, and possibly other YARN resources in the future.
Motivation
In large-scale Spark on YARN deployments, tens or even hundreds of thousands of Spark applications are submitted to a cluster per day, and YARN collects the application logs and aggregates them onto HDFS. Sometimes we want to analyze those logs to identify cluster-level issues; for example, a machine with hardware problems may frequently produce disk or network exceptions. Leveraging Spark to analyze the logs in parallel is a straightforward way to do that.
Describe the solution
Usage might look like this:
$ spark-sql --conf spark.sql.catalog.yarn=org.apache.kyuubi.spark.connector.yarn.YarnCatalog
> SELECT
    app_id, app_attempt_id,
    app_start_time, app_end_time,
    container_id, host,
    file_name, line_num, message
  FROM yarn.agg_logs
  WHERE app_id = 'application_1234'
    AND container_id = 'container_12345'
    AND host = 'hadoop123.example.com'
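
For the cluster-level use case described under Motivation, a query could aggregate across hosts instead of looking up a single container. The following is only a sketch: it assumes the proposed `yarn.agg_logs` table exists with the columns shown above, and the `'%IOException%'` pattern and the `io_exception_lines` / `affected_apps` aliases are illustrative choices, not part of the proposal.

```sql
-- Hypothetical: rank hosts by how many aggregated log lines mention an
-- I/O exception, assuming the proposed yarn.agg_logs table and columns.
SELECT
  host,
  count(*)               AS io_exception_lines,
  count(DISTINCT app_id) AS affected_apps
FROM yarn.agg_logs
WHERE message LIKE '%IOException%'
GROUP BY host
ORDER BY io_exception_lines DESC
LIMIT 20;
```

In practice such a scan would probably also be restricted by time range (for example on `app_start_time`), but the column types are not specified here, so that filter is left out.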
Additional context
No response
Are you willing to submit PR?
[ ] Yes. I would be willing to submit a PR with guidance from the Kyuubi community to improve.