NVIDIA / spark-rapids-tools

User tools for Spark RAPIDS
Apache License 2.0

Add internal CLI to generate instance descriptions for CSPs #1137

Open cindyyuanjiang opened 1 week ago

cindyyuanjiang commented 1 week ago

Fixes https://github.com/NVIDIA/spark-rapids-tools/issues/1123.

This PR is the first step toward removing the dependency on CSP CLIs. Once this CLI is added, we can generate instance description JSON files for each platform and store them as resources in the tools repo. We can then add logic to use these instance description files when running the tools.

Changes

Added an internal CLI spark_rapids_dev generate_instance_description [options].

Options:

The generated JSON file has the following format (inspired by the EMR CLI output):

{
  "instance_name": {
     "VCpuInfo": {
       "DefaultVCpus": xxx
     },
     "MemoryInfo": {
       "SizeInMiB": xxx
     },
     "GpuInfo": {
       "Gpus": [
         {
           "Name": "xxx",
           "Manufacturer": "xxx",
           "Count": xxx,
           "MemoryInfo": {
             "SizeInMiB": xxx
           }
         }
       ]
     }
   },
  ...
}

For a CPU instance, the "GpuInfo" entry is empty: "GpuInfo": {}.
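As a minimal sketch of the format described above, the following builds one GPU entry and one CPU-only entry. The helper `make_instance_entry`, the instance names `example.gpu` and `example.cpu`, and all numeric values are hypothetical and chosen only for illustration; this is not the code added by this PR.

```python
import json
from typing import Optional


def make_instance_entry(vcpus: int, memory_mib: int,
                        gpu_name: Optional[str] = None,
                        gpu_count: int = 0,
                        gpu_memory_mib: int = 0) -> dict:
    """Build one instance entry in the format described above (hypothetical helper)."""
    gpu_info = {}  # CPU-only instances keep "GpuInfo" as an empty object
    if gpu_name is not None and gpu_count > 0:
        gpu_info = {
            "Gpus": [
                {
                    "Name": gpu_name,
                    "Manufacturer": "NVIDIA",  # assumption: only NVIDIA GPUs are described
                    "Count": gpu_count,
                    "MemoryInfo": {"SizeInMiB": gpu_memory_mib},
                }
            ]
        }
    return {
        "VCpuInfo": {"DefaultVCpus": vcpus},
        "MemoryInfo": {"SizeInMiB": memory_mib},
        "GpuInfo": gpu_info,
    }


# Illustrative values only; real values would come from the CSP.
descriptions = {
    "example.gpu": make_instance_entry(8, 32768, "T4", 1, 16384),
    "example.cpu": make_instance_entry(4, 16384),
}
print(json.dumps(descriptions, indent=2))
```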

Example JSON file entry for the EMR platform:
"g5.4xlarge": {
    "VCpuInfo": {
      "DefaultVCpus": 16
    },
    "MemoryInfo": {
      "SizeInMiB": 65536
    },
    "GpuInfo": {
      "Gpus": [
        {
          "Name": "A10G",
          "Manufacturer": "NVIDIA",
          "Count": 1,
          "MemoryInfo": {
            "SizeInMiB": 24576
          }
        }
      ]
    }
  }
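As a rough illustration of how the tools might later consume such a file, the sketch below parses a description and extracts core count, memory, and GPU details per instance. The function `summarize_instance` and the `example.cpu` entry are hypothetical; the `g5.4xlarge` values mirror the entry above. It assumes a CPU instance carries an empty "GpuInfo" object, as stated in the description.

```python
import json

# Sample content: the g5.4xlarge entry from above plus a hypothetical CPU-only entry.
SAMPLE_JSON = """
{
  "g5.4xlarge": {
    "VCpuInfo": {"DefaultVCpus": 16},
    "MemoryInfo": {"SizeInMiB": 65536},
    "GpuInfo": {
      "Gpus": [
        {"Name": "A10G", "Manufacturer": "NVIDIA", "Count": 1,
         "MemoryInfo": {"SizeInMiB": 24576}}
      ]
    }
  },
  "example.cpu": {
    "VCpuInfo": {"DefaultVCpus": 4},
    "MemoryInfo": {"SizeInMiB": 16384},
    "GpuInfo": {}
  }
}
"""


def summarize_instance(entry: dict) -> dict:
    """Flatten one instance entry into the fields a tool would typically need."""
    # An empty "GpuInfo" object (CPU instance) yields an empty GPU list.
    gpus = entry.get("GpuInfo", {}).get("Gpus", [])
    return {
        "vcpus": entry["VCpuInfo"]["DefaultVCpus"],
        "memory_mib": entry["MemoryInfo"]["SizeInMiB"],
        "gpu_count": sum(gpu["Count"] for gpu in gpus),
        "gpu_name": gpus[0]["Name"] if gpus else None,
    }


instances = json.loads(SAMPLE_JSON)
for name, entry in instances.items():
    print(name, summarize_instance(entry))
```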

Testing

spark_rapids_dev generate_instance_description --platform emr/dataproc/databricks-azure

cindyyuanjiang commented 1 week ago

I want to discuss: is it a good approach to add a new spark_rapids_dev CLI or should we keep it under spark_rapids?

mattahrens commented 1 week ago

I would rather keep it under the spark_rapids CLI instead of adding a new one. Is there a way to add the command without exposing it to users since it should be internal?

tgravescs commented 1 week ago

What would be the goal of keeping it under the same CLI that an end user uses, without a useful info message for anyone, including devs, to see?

amahussein commented 1 week ago

> I would rather keep it under the spark_rapids CLI instead of adding a new one. Is there a way to add the command without exposing it to users since it should be internal?

@mattahrens In our CLI, it is not possible to hide a cmd/argument.

amahussein commented 1 week ago

> I want to discuss: is it a good approach to add a new spark_rapids_dev CLI or should we keep it under spark_rapids?

Thanks @cindyyuanjiang ! I will take a look at the changes.

mattahrens commented 6 days ago

> I would rather keep it under the spark_rapids CLI instead of adding a new one. Is there a way to add the command without exposing it to users since it should be internal?
>
> @mattahrens In our CLI, it is not possible to hide a cmd/argument.

OK, then we can have a dev CLI for separation.