dreamquark-ai / tabnet

PyTorch implementation of TabNet paper : https://arxiv.org/pdf/1908.07442.pdf
https://dreamquark-ai.github.io/tabnet/
MIT License

Running error: all loss is 0 when using GridSearchCV and patience=10 no use #468

Closed statmasterY closed 1 year ago

statmasterY commented 1 year ago

I am using a loss function / scorer that I defined myself, but it does not work.

This is my loss function:

import numpy as np
from sklearn.metrics import make_scorer

def ccc_loss(y_true, y_pred):
    # Concordance correlation coefficient (CCC) between targets and predictions
    y_true = np.array(y_true).reshape(-1, 1)
    y_pred = np.array(y_pred).reshape(-1, 1)
    y_true_mean = np.mean(y_true)
    y_pred_mean = np.mean(y_pred)
    y_true_std = np.std(y_true)
    y_pred_std = np.std(y_pred)
    cov = np.mean((y_true - y_true_mean) * (y_pred - y_pred_mean))
    rho = cov / (y_true_std * y_pred_std)
    ccc = 2 * rho * y_true_std * y_pred_std / (
        y_true_std ** 2 + y_pred_std ** 2 + (y_true_mean - y_pred_mean) ** 2
    )
    return ccc

ccc_scorer = make_scorer(ccc_loss, greater_is_better=True)

and this is my training code:

# Set up time-series cross-validation
tscv = TimeSeriesSplit(n_splits=2)
tabnet_params = {
    "n_d": [16],
    "n_a": [16],
    "n_steps": [3],
    "gamma": [1.0],
    "n_independent": [1],
    "n_shared": [1],
}
tabnet_models = []
for i in range(1):
    # early stopping: patience = 10
    tabnet = TabNetRegressor()
    grid_search = GridSearchCV(
        tabnet,
        tabnet_params,
        scoring=ccc_scorer,
        cv=tscv,
        n_jobs=-1,
        verbose=1,
        pre_dispatch="2*n_jobs",
    )
    grid_search.fit(
        X.values,
        y.values.reshape(-1, 1),
        max_epochs=200,
        patience=10,
        batch_size=1024,
        virtual_batch_size=128,
    )
    tabnet_models.append(grid_search.best_estimator_)

the output is

d:\Program Files (x86)\Python\lib\site-packages\pytorch_tabnet\abstract_model.py:75: UserWarning: Device used : cpu
  warnings.warn(f"Device used : {self.device}")
Fitting 2 folds for each of 1 candidates, totalling 2 fits
epoch 0  | loss: 0.0     |  0:00:00s
epoch 1  | loss: 0.0     |  0:00:00s
epoch 2  | loss: 0.0     |  0:00:00s
epoch 3  | loss: 0.0     |  0:00:00s
epoch 4  | loss: 0.0     |  0:00:00s
epoch 5  | loss: 0.0     |  0:00:00s

d:\Program Files (x86)\Python\lib\site-packages\pytorch_tabnet\abstract_model.py:651: UserWarning: **No early stopping will be performed**, last training weights will be used.
  warnings.warn(wrn_msg)

It tells me that patience=10 does not take effect. I could not find any good example online that combines GridSearchCV and TabNet, so I am opening this issue.

Optimox commented 1 year ago

Did you have a look at this issue https://github.com/dreamquark-ai/tabnet/issues/382 ?

statmasterY commented 1 year ago

I have read it, and I have solved the coding problem by using a 'for' loop as in #382. I just want to know whether TabNet supports GridSearchCV, which would be more convenient.

Another question: I am new to TabNet, and it did not work when I added the parameter 'loss_fn=ccc_loss', which might be what causes all losses to be 0. I could not figure out how to fix it.

This is my updated ccc_loss:

import numpy as np
import torch

def ccc_loss(y_true, y_pred):
    # Accept either numpy arrays or torch tensors
    if isinstance(y_true, np.ndarray):
        y_true = y_true.reshape(-1, 1)
        y_pred = y_pred.reshape(-1, 1)
    elif torch.is_tensor(y_true):
        y_true = y_true.detach().numpy().reshape(-1, 1)
        y_pred = y_pred.detach().numpy().reshape(-1, 1)

    y_true_mean = np.mean(y_true)
    y_pred_mean = np.mean(y_pred)
    y_true_std = np.std(y_true)
    y_pred_std = np.std(y_pred)

    cov = np.mean((y_true - y_true_mean) * (y_pred - y_pred_mean))
    rho = cov / (y_true_std * y_pred_std)

    ccc = 2 * rho * y_true_std * y_pred_std / (
        y_true_std ** 2 + y_pred_std ** 2 + (y_true_mean - y_pred_mean) ** 2
    )
    return ccc

I would appreciate it if you could give me some suggestions.

Optimox commented 1 year ago

I think you are confusing the metric function and the loss function. Metrics are scores you want to monitor during evaluation; the loss function is what you want to minimize during training.

The most important thing about the loss function is that it must be differentiable and written in torch, so that autograd can compute the gradients. Here you are detaching the predictions from the graph, so the model can't update its weights based on the final error. You should have two separate functions for metrics and loss. I would advise you to have a look at this notebook, which explains how to customize the different functions: https://github.com/dreamquark-ai/tabnet/blob/develop/customizing_example.ipynb
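For illustration, a minimal sketch of a torch-native version of the CCC objective could look like the function below (the name `ccc_loss_torch` is hypothetical, and I am assuming `loss_fn` is called as `loss_fn(y_pred, y_true)`, like the default MSE loss). It stays on tensors end-to-end, with no `.detach()` or `.numpy()`, so autograd can backpropagate through it, and it returns `1 - CCC` so that minimizing the loss maximizes the score:

```python
import torch

def ccc_loss_torch(y_pred, y_true):
    # Flatten to 1-D tensors; keep everything in torch so gradients flow
    y_pred = y_pred.reshape(-1)
    y_true = y_true.reshape(-1)
    pred_mean = y_pred.mean()
    true_mean = y_true.mean()
    # Population variances, matching np.std(...)**2 in the numpy version
    pred_var = y_pred.var(unbiased=False)
    true_var = y_true.var(unbiased=False)
    cov = ((y_pred - pred_mean) * (y_true - true_mean)).mean()
    # CCC = 2*cov / (var_pred + var_true + (mean difference)^2), in [-1, 1]
    ccc = 2 * cov / (pred_var + true_var + (pred_mean - true_mean) ** 2)
    # Return a quantity to *minimize*: perfect agreement gives loss 0
    return 1 - ccc
```

The numpy-based `ccc_loss` above can then stay as the scorer/metric for GridSearchCV, while this torch version is the one passed as `loss_fn` during training.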

statmasterY commented 1 year ago

Thanks a lot!

Optimox commented 1 year ago

Have you now solved your problem @statmasterY ?