shenweichen / DeepCTR-Torch

【PyTorch】Easy-to-use,Modular and Extendible package of deep-learning based CTR models.
https://deepctr-torch.readthedocs.io/en/latest/index.html
Apache License 2.0

log_loss metric calculation fails when predicted values are very small #55

Closed: alexdyysp closed this issue 4 years ago

alexdyysp commented 5 years ago

Describe the bug

/usr/local/anaconda3/lib/python3.7/site-packages/sklearn/metrics/classification.py:2174: RuntimeWarning: divide by zero encountered in log
  loss = -(transformed_labels * np.log(y_pred)).sum(axis=1)

To Reproduce

Steps to reproduce the behavior:

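from deepctr_torch.models import xDeepFM

# linear_feature_columns, dnn_feature_columns and device are defined elsewhere (not shown in the report)
model = xDeepFM(linear_feature_columns, dnn_feature_columns, task='binary', device=device)
model.compile("adam", "binary_crossentropy",
              metrics=['log_loss'], )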

Operating environment:

Additional context

My model and data produce predictions that are very small (e.g. 0.0001) or very close to 1 (e.g. 0.9998), and sometimes exactly 0 or 1. This causes the sklearn metric calculation to fail.

wzh9969 commented 5 years ago

Same problem. It is caused by the different default float types in PyTorch (float32) and NumPy/sklearn (float64). I solved it by casting the model output to float64 with .astype("float64"). I modified two places, in the BaseModel.fit and BaseModel.predict methods, and that fixed the problem in my case.

# In BaseModel.fit():
train_result[name].append(metric_fun(
    y.cpu().data.numpy(), y_pred.cpu().data.numpy().astype("float64")))

# In BaseModel.predict():
return np.concatenate(pred_ans).astype("float64")
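
For anyone who wants to see the float32 effect in isolation, here is a minimal NumPy-only sketch. It is not sklearn's code: `clip_probs` and `binary_log_loss` are hypothetical helpers written for illustration, and the clipping constant 1e-15 is an assumption based on the default that older scikit-learn releases used. The point it tries to show is that `1 - 1e-15` rounds to exactly 1.0 in float32, so the upper clip does nothing and `log(1 - p)` hits `log(0)`, while the same predictions cast to float64 stay finite.

```python
import numpy as np

EPS = 1e-15  # assumed clipping constant (default in older scikit-learn releases)


def clip_probs(y_pred, eps=EPS):
    """Clip probabilities to [eps, 1 - eps] using the dtype of y_pred."""
    lo = np.asarray(eps, dtype=y_pred.dtype)
    hi = np.asarray(1 - eps, dtype=y_pred.dtype)  # rounds to 1.0 in float32
    return np.clip(y_pred, lo, hi)


def binary_log_loss(y_true, y_pred, eps=EPS):
    """Simplified binary log loss (hypothetical helper, not sklearn's implementation)."""
    p = clip_probs(y_pred, eps)
    y = np.asarray(y_true, dtype=p.dtype)
    # Silence the warnings here so the non-finite result can be inspected directly.
    with np.errstate(divide="ignore", invalid="ignore"):
        losses = -(y * np.log(p) + (1 - y) * np.log(1 - p))
    return float(losses.mean())


y_true = np.array([1, 0])
pred32 = np.array([1.0, 0.0], dtype=np.float32)  # saturated model outputs

loss32 = binary_log_loss(y_true, pred32)                    # upper clip is a no-op
loss64 = binary_log_loss(y_true, pred32.astype("float64"))  # the cast suggested above

print("float32 upper bound equals 1.0:", bool(np.float32(1 - EPS) == 1.0))
print("float32 loss is finite:", bool(np.isfinite(loss32)))
print("float64 loss is finite:", bool(np.isfinite(loss64)))
```

Under these assumptions the float64 cast works because the clip bounds are representable in float64. Another option, not discussed in this thread, would be to clip with a dtype-aware epsilon such as `np.finfo(y_pred.dtype).eps`.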