NVIDIA / spark-rapids-tools

User tools for Spark RAPIDS
Apache License 2.0

[BUG] Scalable solution for output files location in the console output #1264

Open parthosa opened 2 months ago

parthosa commented 2 months ago

Currently, the console output shows the following lines to indicate the location of important files:

    - Summarized savings and speedups CSV report: /output/qual_20240805225850_C3c0aA4E/qualification_summary.csv
    - Intermediate output generated by tools: /output/qual_20240805225850_C3c0aA4E/intermediate_output
    - Metadata file with cluster recommendation and tuning details: /output/qual_20240805225850_C3c0aA4E/app_metadata.json
    - Application status report: /output/qual_20240805225850_C3c0aA4E/rapids_4_spark_qualification_output/rapids_4_spark_qualification_output_status.csv

Comments

amahussein commented 2 months ago

Thanks @parthosa! The name is very generic. Can we change the issue title to be more specific about what we are trying to do here?

That's tricky. It has some sort of personal styling and preferences.

parthosa commented 2 months ago

Thanks @amahussein. Yes, I agree we need a scalable way to address the problem of output files and their display in the console. The list of important files may keep growing, making the console output more cluttered.

Based on this and other offline discussions, we could have a results_metadata.json that contains both outputFiles and appResults entries (sample attached)?

File: `results_metadata.json`
```
{
  "outputFiles": [
    {
      "fileName": "qualification_summary.csv",
      "description": "Summary of the qualification tool run.",
      "path": "/path/qual_20240805225850_C3c0aA4E/qualification_summary.csv"
    },
    {
      "fileName": "rapids_4_spark_qualification_output_status.csv",
      "description": "Status of applications that were processed by the qualification tool.",
      "path": "/path/qual_20240805225850_C3c0aA4E/rapids_4_spark_qualification_output/rapids_4_spark_qualification_output_status.csv"
    }
  ],
  "appResults": [
    {
      "appId": "app-20240311074805-0000",
      "appName": "test_app_xxxxx",
      "eventLog": "file:/path/to/log",
      "clusterInfo": {
        "platform": "dataproc",
        "sourceCluster": {
          "driverNodeType": "n1-standard-16",
          "workerNodeType": "n1-standard-8",
          "numWorkerNodes": 9
        },
        "recommendedCluster": {
          "driverNodeType": "n1-standard-16",
          "workerNodeType": "n1-standard-32",
          "numWorkerNodes": 9,
          "gpuInfo": {
            "device": "nvidia-tesla-t4",
            "gpuPerWorker": 4
          },
          "ssdInfo": {
            "numLocalSsds": 2
          }
        }
      },
      "estimatedGpuSpeedupCategory": "Medium",
      "fullClusterConfigRecommendations": "/tools-run/qual_20240805222947_F2b32E83/rapids_4_spark_qualification_output/tuning/app-20240311074805-0000.conf",
      "gpuConfigRecommendationBreakdown": "/tools-run/qual_20240805222947_F2b32E83/rapids_4_spark_qualification_output/tuning/app-20240311074805-0000.log"
    },
    {
      "appId": "app-20240311074805-0000",
      "appName": "test_app_xxxxx",
      "eventLog": "file:/path/to/log",
      "clusterInfo": {
        "platform": "dataproc",
        "sourceCluster": {
          "driverNodeType": "n1-standard-16",
          "workerNodeType": "n1-standard-8",
          "numWorkerNodes": 9
        },
        "recommendedCluster": {
          "driverNodeType": "n1-standard-16",
          "workerNodeType": "n1-standard-32",
          "numWorkerNodes": 9,
          "gpuInfo": {
            "device": "nvidia-tesla-t4",
            "gpuPerWorker": 4
          },
          "ssdInfo": {
            "numLocalSsds": 2
          }
        }
      },
      "estimatedGpuSpeedupCategory": "Medium",
      "fullClusterConfigRecommendations": "/tools-run/qual_20240805222947_F2b32E83/rapids_4_spark_qualification_output/tuning/app-20240311074805-0000.conf",
      "gpuConfigRecommendationBreakdown": "/tools-run/qual_20240805222947_F2b32E83/rapids_4_spark_qualification_output/tuning/app-20240311074805-0000.log"
    }
  ]
}
```
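
As a rough illustration of how such a file could be produced, here is a minimal Python sketch. The field names follow the sample above. The `OutputFile` class and the `write_results_metadata` helper are hypothetical names used only for this example; they are not part of the existing tools.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class OutputFile:
    """One entry of the proposed `outputFiles` list (field names follow the sample)."""
    fileName: str
    description: str
    path: str


def write_results_metadata(output_dir, output_files, app_results):
    """Hypothetical helper: write the proposed layout to results_metadata.json.

    `app_results` is expected to be a list of plain dicts shaped like the
    `appResults` entries in the sample; no validation is done here.
    """
    metadata = {
        "outputFiles": [asdict(f) for f in output_files],
        "appResults": list(app_results),
    }
    target = Path(output_dir) / "results_metadata.json"
    target.write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return target
```

With this shape, adding a new output file means appending one more `OutputFile` entry instead of adding another line to the console output.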

Now in the console we would have only two lines:

    - Summarized speedups CSV report: /output/qual_20240805225850_C3c0aA4E/qualification_summary.csv
    - Additional information about output files and qualified apps: /output/qual_20240805225850_C3c0aA4E/results_metadata.json
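
For illustration only, the two console lines could be generated along these lines. `console_summary` is a hypothetical helper written for this example, and it assumes both files sit directly under the run's output directory, as in the paths shown above.

```python
from pathlib import PurePosixPath


def console_summary(output_dir):
    """Hypothetical helper: build the two console lines proposed in this issue."""
    root = PurePosixPath(output_dir)
    entries = [
        ("Summarized speedups CSV report", root / "qualification_summary.csv"),
        ("Additional information about output files and qualified apps",
         root / "results_metadata.json"),
    ]
    # The console output stays at two lines no matter how many files
    # end up listed inside results_metadata.json.
    return [f"- {label}: {path}" for label, path in entries]
```

Calling `print("\n".join(console_summary("/output/qual_20240805225850_C3c0aA4E")))` would print the two lines shown above, minus the leading indentation.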