Description
This PR updates the NonValueTransformer class in sdgx/data_processors/transformers/nan.py. The drop_na attribute now defaults to False, so rows with missing values are no longer dropped by default. In addition, the fit method now accepts a custom fill_na_value passed through kwargs. The value must be of type str; otherwise a ValueError is raised.
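The behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code in nan.py: NonValueTransformerSketch is a hypothetical stand-in, rows are plain dicts with None marking a missing value, and the fit/convert signatures and the pass-through fallback when no fill value is supplied are assumptions made only for this example.

```python
from typing import Any, Dict, List, Optional

Row = Dict[str, Any]


class NonValueTransformerSketch:
    """Hypothetical stand-in, not the sdgx class; None marks a missing value."""

    def __init__(self) -> None:
        # Default described in this PR: rows with missing values are kept.
        self.drop_na: bool = False
        # Assumption: no fill value is set until fit() receives one.
        self.fill_na_value: Optional[str] = None

    def fit(self, **kwargs: Any) -> None:
        # Validate only when the caller actually passed fill_na_value.
        if "fill_na_value" in kwargs:
            value = kwargs["fill_na_value"]
            if not isinstance(value, str):
                raise ValueError("fill_na_value must be of type str")
            self.fill_na_value = value

    def convert(self, rows: List[Row]) -> List[Row]:
        if self.drop_na:
            # Drop every row that contains at least one missing value.
            return [row for row in rows if None not in row.values()]
        if self.fill_na_value is None:
            # Assumption: without a fill value, rows pass through unchanged.
            return [dict(row) for row in rows]
        # Replace each missing value with the custom string.
        return [
            {key: self.fill_na_value if value is None else value
             for key, value in row.items()}
            for row in rows
        ]


transformer = NonValueTransformerSketch()
transformer.fit(fill_na_value="missing")
rows = [{"name": "a", "city": None}, {"name": None, "city": "x"}]
converted = transformer.convert(rows)
```

In this sketch, passing a non-string such as fill_na_value=0 to fit raises ValueError, which mirrors the type check described above.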
Motivation and Context
This change makes the NonValueTransformer class more flexible. Users can now specify a custom string to fill in missing data, which helps in scenarios where a specific placeholder value is preferred over dropping the affected rows.
How has this been tested?
The changes were tested with unit tests that cover the fit and convert methods of the NonValueTransformer class.
Types of changes
[ ] Maintenance (no change in code, maintain the project's CI, docs, etc.)
[x] Bug fix (non-breaking change which fixes an issue)
[ ] New feature (non-breaking change which adds functionality)
[ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
Checklist:
[x] My code follows the code style of this project.
[ ] My change requires a change to the documentation.