Closed: jack1981 closed this issue 2 years ago.
Hi @jack1981, I believe your question fits better in the context of the spark-dashboard implementation with the Spark metrics system, as described in https://github.com/LucaCanali/Miscellaneous/tree/master/Spark_Dashboard and in https://github.com/cerndb/spark-dashboard. On that note, I'd like to share that I have been working on extensions of Spark monitoring that cover S3A and other I/O and OS metrics for Spark 3.0; please see https://github.com/cerndb/SparkPlugins. I'd be interested in collecting feedback. Best, L.
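For orientation, a minimal sketch of how such a plugin could be wired in through the Spark 3.0 plugin mechanism is shown below. The plugin class name and the cloudFsName parameter are assumptions based on the SparkPlugins repository; please check its README for the exact class names and configuration keys.

// Minimal sketch (Scala): loading an I/O metrics plugin on Spark 3.0.
// "ch.cern.CloudFSMetrics" and "spark.cernSparkPlugin.cloudFsName" are
// assumptions; verify them against the SparkPlugins documentation.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("s3a-metrics-example")
  // spark.plugins is the standard Spark 3.0 setting for loading plugins
  .config("spark.plugins", "ch.cern.CloudFSMetrics")
  // assumed parameter selecting the filesystem scheme to instrument
  .config("spark.cernSparkPlugin.cloudFsName", "s3a")
  .getOrCreate()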
We are using S3-compatible object storage for Spark storage, but the current default I/O metrics support HDFS only. For example:
SELECT non_negative_derivative("value", 1s) FROM "filesystem.hdfs.read_bytes" WHERE "applicationid" = '$ApplicationId' AND $timeFilter GROUP BY process
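Ideally we would want an analogous query for S3A, along the lines of the sketch below; the measurement name "filesystem.s3a.read_bytes" is hypothetical, since no such metric is currently exported by default:

SELECT non_negative_derivative("value", 1s) FROM "filesystem.s3a.read_bytes" WHERE "applicationid" = '$ApplicationId' AND $timeFilter GROUP BY process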
Is there any way to fetch I/O metrics for other distributed file systems?
Thanks!