Granulate / gprofiler

gProfiler is a system-wide profiler that combines multiple sampling profilers to produce a unified visualization of what your CPU is spending time on.
https://profiler.granulate.io
Apache License 2.0

WIP: Spark --app-id as container name #871

Open Jongy opened 6 months ago

Jongy commented 6 months ago

Description

This is a redo of the branch spark-app-id-as-container-name (from cf74924629b4db32c2a9d78abb89e1151037a5e6).

The idea is to allow using Spark application IDs as the "container names" collected by gProfiler. Logically, a "container name" is just an identifier describing a process or a set of processes across (potentially) multiple nodes. For Docker, k8s, ECS, etc., the actual container names are the natural choice of identifier. In Spark/YARN, Spark application IDs or YARN application names can be used instead.

I turned ContainerNamesClient into a generic base class, ContainerNamesClientBase, which is implemented by ContainerNamesClient (the existing one) and SparkContainerNamesClient (the new one).
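
For illustration only, the split could look roughly like the sketch below. It is not the code in this branch: the method name get_container_name, the helpers read_cmdline and parse_app_id, and the injectable cmdline_reader argument are assumptions made for the example. The one thing it relies on from Spark itself is that executor processes are launched with an --app-id argument. The existing Docker-backed ContainerNamesClient is omitted.

```python
from abc import ABC, abstractmethod
from typing import Callable, List, Optional


def read_cmdline(pid: int) -> List[str]:
    """Read a process's argv from /proc (Linux only). Hypothetical helper."""
    with open(f"/proc/{pid}/cmdline", "rb") as f:
        return [arg.decode(errors="replace") for arg in f.read().split(b"\0") if arg]


def parse_app_id(argv: List[str]) -> Optional[str]:
    """Return the value that follows --app-id, or None if it is absent."""
    # Skip the last element: a trailing --app-id has no value after it.
    for i, arg in enumerate(argv[:-1]):
        if arg == "--app-id":
            return argv[i + 1]
    return None


class ContainerNamesClientBase(ABC):
    """Maps a PID to a logical 'container name' (interface is an assumption)."""

    @abstractmethod
    def get_container_name(self, pid: int) -> str:
        """Return an identifier for the process, or "" if none is known."""


class SparkContainerNamesClient(ContainerNamesClientBase):
    """Uses the Spark application ID found on the command line as the name."""

    def __init__(self, cmdline_reader: Callable[[int], List[str]] = read_cmdline) -> None:
        # The reader is injectable so the class can be exercised without /proc.
        self._read_cmdline = cmdline_reader

    def get_container_name(self, pid: int) -> str:
        try:
            argv = self._read_cmdline(pid)
        except OSError:
            # The process may have exited between sampling and lookup.
            return ""
        return parse_app_id(argv) or ""
```

With this shape, the rest of the profiler would only depend on ContainerNamesClientBase, and the choice between the Docker-backed client and the Spark-backed one could be made once at startup.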

Related Issue

https://github.com/Granulate/gprofiler/issues/869

How Has This Been Tested?

TBD