NVIDIA / spark-rapids-tools

User tools for Spark RAPIDS
Apache License 2.0

[FEA] Support Databricks fleet clusters #1422

Open amahussein opened 1 week ago

amahussein commented 1 week ago

Is your feature request related to a problem? Please describe.

This is a follow-up to a question posted earlier in #1393. The tools reported the worker instance type as rd-fleet.8xlarge, which is incorrect. For customers running on fleet clusters, we need to decide how to support that:

  1. Challenge: workers can be assigned different instance types, which does not work with our model because it assumes homogeneous worker configurations.
  2. We also need to investigate whether it is possible to find the specific instance type assigned to each worker. If so, the tools should return something like r[0-9]d.8xlarge instead of rd-fleet.8xlarge.
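
To make item 2 concrete, here is a minimal Python sketch of how a fleet label could be turned into a pattern such as r[0-9]d.8xlarge and checked against a concrete instance type. It assumes fleet labels follow the form `<family><suffix>-fleet.<size>`, which is inferred only from the single example in this issue. The function names `fleet_to_pattern` and `matches_fleet` are hypothetical and are not part of the existing tools.

```python
import re
from typing import Optional

# Assumed label shape: "<family letter><optional suffix>-fleet.<size>",
# e.g. "rd-fleet.8xlarge". This convention is a guess based on one example.
_FLEET_RE = re.compile(
    r"^(?P<family>[a-z])(?P<suffix>[a-z]*)-fleet\.(?P<size>[0-9a-z]+)$"
)


def fleet_to_pattern(instance_type: str) -> Optional[str]:
    """Return a display pattern for a fleet label, or None if it is not a fleet label."""
    m = _FLEET_RE.match(instance_type)
    if m is None:
        return None
    # A single digit stands in for the unknown instance generation.
    return f"{m['family']}[0-9]{m['suffix']}.{m['size']}"


def matches_fleet(fleet_type: str, candidate: str) -> bool:
    """Check whether a concrete instance type could belong to the given fleet label."""
    m = _FLEET_RE.match(fleet_type)
    if m is None:
        return False
    # Build a strict regex: the dot is escaped so it only matches a literal ".".
    strict = (
        re.escape(m["family"])
        + "[0-9]"
        + re.escape(m["suffix"])
        + r"\."
        + re.escape(m["size"])
    )
    return re.fullmatch(strict, candidate) is not None


if __name__ == "__main__":
    print(fleet_to_pattern("rd-fleet.8xlarge"))
    print(matches_fleet("rd-fleet.8xlarge", "r5d.8xlarge"))
```

This only addresses how the result is reported. It does not resolve item 1, because the workers behind one fleet label may still be heterogeneous.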

Additional context