Closed SchererM99 closed 2 years ago
Hi @SchererM99 ,
Did you save your model using the save_model method, as in:
saving_path_name = "./tabnet_model_test_1"
saved_filepath = clf.save_model(saving_path_name)
where clf is a TabNetClassifier?
Hi @eduardocarvp,
Yes, I followed the part about saving and loading directly from the Readme.
@SchererM99,
I'd like to help, but that's hard without a minimal reproducible example. Moreover, you can find dozens of counterexamples showing that the behavior you are asking for actually works. It's even tested in the CI of the repo.
So the real problem probably comes from your code and not from the library.
Please check the shape of the numpy array you are passing for prediction; it should be (N_samples, N_features).
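A quick way to run that shape check before calling predict is sketched below. The helper name check_prediction_input and the n_features argument are made up for illustration and are not part of pytorch_tabnet; 53 is the feature count mentioned in the bug report.

```python
import numpy as np


def check_prediction_input(X, n_features):
    """Return X as an array after verifying it is (N_samples, n_features).

    Hypothetical helper for illustration only; not a pytorch_tabnet API.
    """
    X = np.asarray(X)
    if X.ndim != 2:
        raise ValueError(f"expected a 2-D array, got {X.ndim} dimension(s)")
    if X.shape[1] != n_features:
        raise ValueError(
            f"expected {n_features} features, got {X.shape[1]}"
        )
    return X


# Example with dummy data shaped like the reported dataset (53 features).
X_ok = check_prediction_input(np.zeros((5, 53)), n_features=53)
print(X_ok.shape)
```

If the check raises, the array was probably built from the wrong columns (for example, an index column that was not dropped) or was flattened somewhere along the way.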
@SchererM99 do you have more information to share? A short script to reproduce the error? More info on your data? A code sample?
@Optimox Yes, I wrote a short script which produces the error for me. I also shared the dataframes and the model below. Thank you for your help!
import pandas as pd
from pytorch_tabnet.tab_model import TabNetClassifier
X_test = pd.read_csv("xtestp94.csv")
y_test = pd.read_csv("ytestp94.csv")
X_test = X_test.drop(columns="Unnamed: 0")
y_test = y_test.drop(columns="Unnamed: 0")
X_test_tabnet = X_test[X_test.columns].values
y_test_tabnet = y_test["decision"].values
best_net = TabNetClassifier()
best_net.load_model("models/steps_7_gamma_1.5_indepglus_4_sharedglus_1_lambdasparse_0.001.zip")
y_pred = best_net.predict(X_test_tabnet)
steps_7_gamma_1.5_indepglus_4_sharedglus_1_lambdasparse_0.001.zip xtestp94.csv ytestp94.csv
Have you tried giving a float numpy matrix to the model? Like this:
X_test_tabnet = X_test.values.astype(float)
@SchererM99 feel free to reopen if type float does not solve your problem.
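For anyone landing here with the same error, the likely cause can be reproduced without TabNet at all. When a DataFrame mixes integer and boolean columns, pandas has no common numeric dtype for them, so .values returns an array of dtype object, which is what default_collate refuses. The sketch below uses a tiny made-up DataFrame (the column names count and flag are hypothetical) as a stand-in for the real test set, and assumes the reporter's data behaves the same way.

```python
import pandas as pd

# Hypothetical stand-in for the real test set:
# one integer column and one boolean column.
X_test = pd.DataFrame({"count": [3, 7, 1], "flag": [True, False, True]})

# Mixed int/bool columns collapse to a generic object array.
raw = X_test.values
print(raw.dtype)  # object

# Casting to float gives a plain numeric matrix of the same shape.
X_test_tabnet = X_test.values.astype(float)
print(X_test_tabnet.dtype, X_test_tabnet.shape)  # float64 (3, 2)
```

Printing the dtype of the array right before predict is a cheap way to confirm whether this is what is happening with your own data.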
Describe the bug
After training the model, when loading it and using the predict method I get TypeError: default_collate: batch must contain tensors, numpy arrays, numbers, dicts or lists; found object
What is the current behavior? Using either unseen test data or the data used for training, the error above shows up. No changes to the data were made between training and prediction.
If the current behavior is a bug, please provide the steps to reproduce.
This could be difficult to reproduce, as I think it depends on the data. The network is trained on a (319912, 53) dataset, consisting of integer and boolean columns. The test data is similar, but smaller.
Expected behavior
The method should return the predictions of the model instead.
Screenshots
Other relevant information:
python version: 3.10
Operating System: Windows 10
Additional tools: PyCharm, Jupyter Notebook
Additional context
-