Raise more detailed exception when encountering invalid feature values

bnaul commented 8 years ago

Fixes #92. Combined with cesium-ml/cesium_web#109 this should also make debugging failures on the frontend easier.

bnaul commented 8 years ago

Yeah, it's pd.DataFrame.any which by default is column-wise.

On Thu, Oct 6, 2016 at 3:57 PM acrellin notifications@github.com wrote:

@acrellin commented on this pull request.

In cesium/build_model.py https://github.com/cesium-ml/cesium/pull/195#pullrequestreview-3221129:

if params_to_optimize:

model = fit_model_optimize_hyperparams(feature_df, featureset['target'],

model, params_to_optimize, cv)

else:

model.fit(feature_df, featureset['target'])

try:

if params_to_optimize:

model = fit_model_optimize_hyperparams(feature_df, featureset['target'],

model, params_to_optimize, cv)

else:

model.fit(feature_df, featureset['target'])

except ValueError as e:

nan_feats = np.isnan(feature_df).any()

inf_feats = np.isinf(feature_df).any()

message = "Invalid feature values detected:"

if nan_feats.any():

Does the call to any() above return an array of values?

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/cesium-ml/cesium/pull/195#pullrequestreview-3221129, or mute the thread https://github.com/notifications/unsubscribe-auth/AA3J5xO3YQgC5C7-IB-YERnW51Xawu5Uks5qxXzkgaJpZM4KQfsr .

acrellin commented 8 years ago

Looks good @bnaul :+1:

bnaul commented 8 years ago

Huh?

On Thu, Oct 6, 2016 at 4:08 PM Stefan van der Walt notifications@github.com wrote:

@stefanv commented on this pull request.

In cesium/tests/test_build_model.py https://github.com/cesium-ml/cesium/pull/195#pullrequestreview-3222270:

assert isinstance(model, RandomForestClassifier) + + +def test_invalid_feature_values():

"""Test proper exception handling for invalid feature values"""

fset = sample_featureset(10, 1, ['x_valid', 'x_inf', 'x_nan'], ['class1', 'class2'])

fset.x_inf.values[0, 0] = np.inf

fset.x_nan.values[0, 0] = np.nan

model = build_model.MODELS_TYPE_DICT['RandomForestClassifier']()

try:

model = build_model.build_model_from_featureset(fset, model)

except ValueError as e:

assert 'x_valid' not in str(e)

assert 'x_inf' in str(e)

assert 'x_nan' in str(e) +

Else?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/cesium-ml/cesium/pull/195#pullrequestreview-3222270, or mute the thread https://github.com/notifications/unsubscribe-auth/AA3J5_wOXF5yk0E1pxNlxSoSyAwzoMcvks5qxX9UgaJpZM4KQfsr .

acrellin commented 8 years ago

I think he means make sure that exception actually gets raised - if not, that test will pass.

bnaul commented 8 years ago

Ah, smart...how about now?

acrellin commented 8 years ago

:+1:

stefanv commented 8 years ago

🚂

cesium-ml / cesium

Raise more detailed exception when encountering invalid feature values #195

@acrellin commented on this pull request.

@stefanv commented on this pull request.