NVIDIA / spark-rapids-tools

User tools for Spark RAPIDS
Apache License 2.0

Add qualx support for platform runtime variants (DB AWS) #1417

Closed leewyang closed 1 week ago

leewyang commented 1 week ago

This PR follows #1414 to support variants of platform runtimes (e.g. photon) in the qualx models.

Model variants are delimited by the underscore character '_'. This PR also updates the models from the latest code and datasets, including the new databricks-aws_photon model, whose name follows this convention.
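
To illustrate the naming convention, here is a minimal sketch of splitting a model name into its platform and variant at the first underscore. The helper `split_model_name` is hypothetical and is not taken from the qualx code; it only assumes that the platform part itself contains no underscore, as in `databricks-aws_photon`.

```python
from typing import Optional, Tuple


def split_model_name(model_name: str) -> Tuple[str, Optional[str]]:
    """Split '<platform>_<variant>' into (platform, variant).

    Hypothetical helper: returns None for the variant when the name
    has no underscore, i.e. it refers to the base platform model.
    """
    platform, sep, variant = model_name.partition("_")
    return platform, (variant if sep else None)


platform, variant = split_model_name("databricks-aws_photon")
base_platform, base_variant = split_model_name("databricks-aws")
```

Here `platform` is `"databricks-aws"` and `variant` is `"photon"`, while a name without an underscore yields no variant.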

Changes

  1. adds sparkRuntime as a new expected_raw_feature.
  2. uses the sparkRuntime column to detect datasets with mixed runtimes (i.e. more than one).
  3. modifies the prediction loop to operate on input rows grouped by runtime variant.
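
The three changes above can be sketched roughly as follows. This is not the qualx implementation: the row layout, the `appId` key, the runtime labels `SPARK` and `PHOTON`, and the helpers `group_by_runtime`, `has_mixed_runtimes`, `model_name_for`, and `predict_by_runtime` are all assumptions made for illustration. Only the `sparkRuntime` column name and the `databricks-aws_photon` model name come from the PR description.

```python
from collections import defaultdict

DEFAULT_RUNTIME = "SPARK"  # assumed label for the base (non-variant) runtime


def group_by_runtime(rows):
    """Group row dicts by their 'sparkRuntime' value, keeping input order."""
    groups = defaultdict(list)
    for row in rows:
        groups[row.get("sparkRuntime", DEFAULT_RUNTIME)].append(row)
    return dict(groups)


def has_mixed_runtimes(rows):
    """Return True when the dataset contains more than one distinct runtime."""
    return len(group_by_runtime(rows)) > 1


def model_name_for(platform, runtime):
    """Hypothetical naming rule: '<platform>' or '<platform>_<variant>'."""
    if runtime == DEFAULT_RUNTIME:
        return platform
    return f"{platform}_{runtime.lower()}"


def predict_by_runtime(rows, platform, models):
    """Run each runtime group through the model for that variant.

    `models` maps a model name to a callable that takes a list of rows
    and returns one prediction per row (a stand-in for a trained model).
    """
    predictions = {}
    for runtime, group in group_by_runtime(rows).items():
        model = models[model_name_for(platform, runtime)]
        for row, pred in zip(group, model(group)):
            predictions[row["appId"]] = pred
    return predictions


# Toy data: one base-runtime row and two Photon rows.
rows = [
    {"appId": "app-1", "sparkRuntime": "SPARK"},
    {"appId": "app-2", "sparkRuntime": "PHOTON"},
    {"appId": "app-3", "sparkRuntime": "PHOTON"},
]
# Stand-in models that return a constant prediction per row.
models = {
    "databricks-aws": lambda group: [1.0 for _ in group],
    "databricks-aws_photon": lambda group: [2.0 for _ in group],
}
result = predict_by_runtime(rows, "databricks-aws", models)
```

Grouping before prediction means each row is scored by the model that matches its runtime, rather than applying a single model to a mixed dataset.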

Test

The following commands have been tested:

spark_rapids prediction

Internal Usage:

python qualx_main.py preprocess
python qualx_main.py train
python qualx_main.py evaluate

parthosa commented 1 week ago

Thanks, @leewyang! This PR introduces Photon models for Databricks on AWS. Could we update the title to reflect “DB AWS”?

In the future, when we add support for DB Azure models, this will help provide clarity.

tgravescs commented 1 week ago

so DB AWS here is in the title because we use a model training on eventlogs from databricks aws? What does it do on azure or gcp? Are we expecting differences there? I would expect the databricks code to be the same, but you have different machine types and possibly different I/O characteristics, so I'm wondering if we have seen differences from those.

parthosa commented 1 week ago

so DB AWS here is in the title because we use a model training on eventlogs from databricks aws?

Yes, it adds a model trained on DB AWS Photon event logs.

What does it do on azure or gcp? Are we expecting differences there?