Closed taokz closed 1 year ago
@5uperpalo, can you have a look at this? 👆🏼
sure, I'll look into it and respond by tomorrow lunch
@taokz both issues are likely connected to the data you are using; can you share a sample of the data? To anonymize it, you can change the column names and use the simple .sample()
method on the dataframe ...
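Something along these lines would be enough (a quick sketch with made-up column names):

import numpy as np
import pandas as pd

# stand-in for your private dataframe
df = pd.DataFrame(np.random.rand(1000, 4), columns=["feat_a", "feat_b", "feat_c", "target"])

# rename the columns to generic names and share only a small random sample
df_anon = df.rename(columns={c: f"col_{i}" for i, c in enumerate(df.columns)})
df_anon.sample(n=50, random_state=42).to_csv("anonymized_sample.csv", index=False)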
pip install -r requirements.txt -U
Set verbose=True; do you have a non-NA loss? best_epoch is saved only if the monitor is working and the monitored metric improves (by default the validation loss); do you see non-NA verbose output?
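For reference, the kind of monitoring setup I have in mind looks roughly like this (a sketch from memory, please double-check the argument names against the docs):

from pytorch_widedeep.callbacks import EarlyStopping, ModelCheckpoint

# best_epoch is only recorded when the monitored metric (val_loss by default) improves
callbacks = [
    ModelCheckpoint(filepath="model_weights/wd_out", monitor="val_loss", save_best_only=True),
    EarlyStopping(monitor="val_loss", patience=2),
]
# pass these callbacks to the Trainer and keep verbose on, so the per-epoch losses
# are printed and you can see whether they turn into NaN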
Perhaps the reason for the issue is that there are NaN values in your data. I faced a similar problem, but it was resolved when I used dropna() on my data.
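For example (a minimal illustration with a toy dataframe):

import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [4.0, 5.0, np.nan]})
print(df.isna().sum())                    # how many NaNs per column
df = df.dropna().reset_index(drop=True)   # drop the rows that contain any NaN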
@ibowennn Thank you for your reminder. However, I have checked my data and there are no NaN values.
@5uperpalo I kindly appreciate your quick reply.
For issue 1, I noticed that I wrongly used the fit() method:
trainer.fit(
    X_tab=X_num_train,
    target=y_train,
    X_tab_val=X_num_valid,  # there is no such argument in the base trainer
    target_val=y_valid,  # there is no such argument in the base trainer
    n_epochs=2,
    batch_size=1024,
)
I modified it to be the following, and it works (for TabNet).
trainer.fit(
    X_tab=X_tab,
    target=target,
    n_epochs=2,
    batch_size=1024,
    val_split=0.2,
)
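(Side note: if I read the docs correctly, an explicit validation set can also be passed as a dictionary via the X_val argument instead of using val_split; this is an untested sketch with names written from memory:)

trainer.fit(
    X_train={"X_tab": X_num_train, "target": y_train},
    X_val={"X_tab": X_num_valid, "target": y_valid},
    n_epochs=2,
    batch_size=1024,
)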
However, I still cannot solve issue 2: there is a NaN loss (for transformers such as tab_transformer). I guess it is because I pass cat_embed_input=None, since my data only has continuous features. Is it required to set cat_embed_input != None for transformer-based models? The example notebook may have the wrong link. Do you mean this?
BTW, I am sorry, but the data is private and I cannot share it here. You can treat it as a table with only numerical values and no NaNs.
I am sorry for the late response @taokz, I had some personal issues holding me back... Last time I did not upload the latest version of the troubleshooting notebook, and yes, you were right, I used the wrong link. I have updated the troubleshooting notebook I posted earlier; there you can see in the section ISSUE num.2
that TabTransformer can work without categorical features.
As you are working with private/proprietary data, I would suggest the following:
* if your objective is binary, the loss defaults to BCEWithLogitsLoss, i.e. here and here
* use import pdb; pdb.set_trace() inside the trainer.fit() to debug what ground truth and predicted values you are sending to the loss function, e.g. set the trace here or here (see the sketch below)
* check the initializers you are using (e.g. XavierNormal)
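For example, once pdb stops inside the training loop you can inspect the tensors that go into the loss; this is just an illustration of the kind of checks I mean (the tensors below are made up, not the library's internals):

import torch

# stand-ins for the predictions and targets the trainer passes to the loss function
y_pred = torch.tensor([0.2, float("nan"), 0.9])
y_true = torch.tensor([0.0, 1.0, 1.0])

print(torch.isnan(y_pred).any())   # True here: the model output already contains NaNs
print(torch.isnan(y_true).any())   # also check the ground truth
print(y_pred.min(), y_pred.max())  # extreme values can hint at exploding activations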
Note: Please let me know if any of this helped. It could help other people, including us, if we come across the same issue.
hi @taokz, here is fully functional code using a dataset with ALL continuous cols. Maybe you could use it as a starting point to fix the issue you are experiencing:
import numpy as np

from pytorch_widedeep import Trainer
from pytorch_widedeep.models import TabTransformer, WideDeep
from pytorch_widedeep.datasets import load_california_housing
from pytorch_widedeep.callbacks import (
    EarlyStopping,
    ModelCheckpoint,
)
from pytorch_widedeep.preprocessing import TabPreprocessor

if __name__ == "__main__":
    df = load_california_housing(as_frame=True)

    # project latitude/longitude onto x/y coordinates so every feature is continuous
    df["location_x"] = np.cos(df.Latitude) * np.cos(df.Longitude)
    df["location_y"] = np.cos(df.Latitude) * np.sin(df.Longitude)
    df.drop(["Latitude", "Longitude"], axis=1, inplace=True)

    target_col = "MedHouseVal"
    target = df[target_col].values
    continuous_cols = [c for c in df.columns if c != target_col]

    # note: no cat_embed_cols at all, only continuous columns
    tab_preprocessor = TabPreprocessor(
        continuous_cols=continuous_cols,
        cols_to_scale=continuous_cols,
        for_transformer=True,
    )
    X_tab = tab_preprocessor.fit_transform(df)

    # TabTransformer without categorical features; the continuous cols are embedded
    tab_transformer = TabTransformer(
        column_idx=tab_preprocessor.column_idx,
        continuous_cols=continuous_cols,
        embed_continuous=True,
        input_dim=8,
        n_blocks=1,
        n_heads=2,
    )
    model = WideDeep(deeptabular=tab_transformer)

    callbacks = [
        EarlyStopping(patience=2),
        ModelCheckpoint(filepath="model_weights/wd_out"),
    ]

    trainer = Trainer(
        model,
        objective="regression",
        callbacks=callbacks,
    )

    trainer.fit(
        X_tab=X_tab,
        target=target,
        n_epochs=10,
        batch_size=128,
        val_split=0.2,
    )
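Once training finishes, you can sanity-check the predictions; if I remember the API correctly, predict accepts the same tabular array:

preds = trainer.predict(X_tab=X_tab)
print(np.isnan(preds).any())  # should be False if training was stable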
@5uperpalo @jrzaurin Thank you for your detailed guidance! I have recently been focusing on another project, so I did not respond in time. I appreciate your time and effort!
Hi
Thank you for your awesome repo. I encountered two issues:
I appreciate your help in advance! Thanks!