NVIDIA / spark-rapids-tools

User tools for Spark RAPIDS
Apache License 2.0
Fix nan label issue in training #1104

Closed leewyang closed 3 weeks ago

leewyang commented 3 weeks ago

This PR fixes a regression in training introduced in #1102.

Changes

  1. Restore the code that removes rows with NaN labels, now relocated inside the train() method.
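As a rough illustration of the change above, the sketch below filters out rows with NaN labels at the start of training. It is not the code from this PR: the names `drop_nan_labels`, `rows`, and `label_col`, the list-of-dicts row format, and the stand-in `train()` body are all assumptions made for the example.

```python
import math


def drop_nan_labels(rows, label_col):
    """Return only the rows whose label is present and not NaN."""
    kept = []
    for row in rows:
        label = row.get(label_col)
        if label is None:
            continue  # missing label
        if isinstance(label, float) and math.isnan(label):
            continue  # NaN label
        kept.append(row)
    return kept


def train(rows, label_col="label"):
    """Hypothetical train(): filter NaN labels first, then fit."""
    clean = drop_nan_labels(rows, label_col)
    if not clean:
        raise ValueError("no rows with valid labels to train on")
    # Stand-in for model fitting: the mean label over the cleaned rows.
    mean_label = sum(r[label_col] for r in clean) / len(clean)
    return {"n_rows": len(clean), "mean_label": mean_label}
```

Doing the filtering inside train() means the other commands can still see the unfiltered rows, while training never receives a NaN target.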

Test

The following commands have been tested:

Internal Usage:

python qualx_main.py preprocess
python qualx_main.py predict
python qualx_main.py train
python qualx_main.py evaluate
python qualx_main.py compare