PyTorch Tabular aims to make Deep Learning with Tabular data easy and accessible to real-world cases and research alike. The core principles behind the design of the library are:
It is built on the shoulders of giants such as PyTorch (obviously) and PyTorch Lightning.
Although the installation pulls in PyTorch as a dependency, the recommended way is to first install PyTorch from the official PyTorch website, picking the right CUDA version for your machine.
Once you have PyTorch installed, use:
pip install -U "pytorch_tabular[extra]"
to install the complete library with extra dependencies (Weights & Biases and Plotly).
And:
pip install -U "pytorch_tabular"
for the bare essentials.
The sources for pytorch_tabular can be downloaded from the GitHub repo.
You can clone the public repository:
git clone https://github.com/manujosephv/pytorch_tabular
Once you have a copy of the source, you can install it with:
cd pytorch_tabular && pip install ".[extra]"
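After either installation route, it can be useful to confirm which packages are actually importable in the current environment. The following is a minimal sketch using only the Python standard library; the helper name `is_importable` is hypothetical, and the module names `wandb` and `plotly` are assumed to correspond to the Weights & Biases and Plotly extras mentioned above.

```python
from importlib import util


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return util.find_spec(module_name) is not None


# Core packages first, then the optional extras (assumed module names).
modules_to_check = ["torch", "pytorch_tabular", "wandb", "plotly"]
report = {name: is_importable(name) for name in modules_to_check}

for name, found in report.items():
    status = "found" if found else "missing"
    print(f"{name}: {status}")
```

If `wandb` or `plotly` show up as missing after a bare install, that is expected, since they are only installed with the `[extra]` option.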
For complete documentation with tutorials, visit ReadTheDocs.
Semi-Supervised Learning
To implement new models, see the How to implement new models tutorial. It covers basic as well as advanced architectures.
from pytorch_tabular import TabularModel
from pytorch_tabular.models import CategoryEmbeddingModelConfig
from pytorch_tabular.config import (
    DataConfig,
    OptimizerConfig,
    TrainerConfig,
    ExperimentConfig,
)

# train, val and test are pandas DataFrames; num_col_names and cat_col_names
# are lists of column names. All of them must be defined beforehand.
data_config = DataConfig(
    target=["target"],  # target should always be a list.
    continuous_cols=num_col_names,
    categorical_cols=cat_col_names,
)
trainer_config = TrainerConfig(
    auto_lr_find=True,  # Runs the LRFinder to automatically derive a learning rate
    batch_size=1024,
    max_epochs=100,
)
optimizer_config = OptimizerConfig()

model_config = CategoryEmbeddingModelConfig(
    task="classification",
    layers="1024-512-512",  # Number of nodes in each layer
    activation="LeakyReLU",  # Activation between each layer
    learning_rate=1e-3,
)

tabular_model = TabularModel(
    data_config=data_config,
    model_config=model_config,
    optimizer_config=optimizer_config,
    trainer_config=trainer_config,
)
tabular_model.fit(train=train, validation=val)
result = tabular_model.evaluate(test)
pred_df = tabular_model.predict(test)
tabular_model.save_model("examples/basic")
loaded_model = TabularModel.load_model("examples/basic")
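The usage example above assumes that `train`, `val`, `test`, `num_col_names` and `cat_col_names` already exist. As a rough sketch of one way to produce them, the snippet below builds a small synthetic pandas DataFrame and splits it 70/15/15. The column names (`num_a`, `num_b`, `cat_a`), the row count and the split ratios are all illustrative assumptions, not something the library prescribes; only the `target` column name matches the example above.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n_rows = 1000

# Synthetic data: two numeric features, one categorical feature, binary target.
df = pd.DataFrame(
    {
        "num_a": rng.normal(size=n_rows),
        "num_b": rng.uniform(0.0, 1.0, size=n_rows),
        "cat_a": rng.choice(["red", "green", "blue"], size=n_rows),
        "target": rng.integers(0, 2, size=n_rows),
    }
)

# Column lists in the form the usage example passes to DataConfig.
num_col_names = ["num_a", "num_b"]
cat_col_names = ["cat_a"]

# Shuffle once, then cut into 70% train, 15% validation, 15% test.
shuffled = df.sample(frac=1.0, random_state=42).reset_index(drop=True)
train_end = n_rows * 70 // 100
val_end = n_rows * 85 // 100
train = shuffled.iloc[:train_end]
val = shuffled.iloc[train_end:val_end]
test = shuffled.iloc[val_end:]
```

With real data, the same three DataFrames and two column lists would come from your own dataset; a stratified split may be preferable when the classes are imbalanced.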
If you use PyTorch Tabular for a scientific publication, we would appreciate citations to the published software and the following paper:
@misc{joseph2021pytorch,
title={PyTorch Tabular: A Framework for Deep Learning with Tabular Data},
author={Manu Joseph},
year={2021},
eprint={2104.13638},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
@software{manu_joseph_2023_7554473,
author = {Manu Joseph and
Jinu Sunil and
Jiri Borovec and
Chris Fonnesbeck and
jxtrbtk and
Andreas and
JulianRein and
Kushashwa Ravi Shrimali and
Luca Actis Grosso and
Sterling G. Baird and
Yinyu Nie},
title = {manujosephv/pytorch\_tabular: v1.0.1},
month = jan,
year = 2023,
publisher = {Zenodo},
version = {v1.0.1},
doi = {10.5281/zenodo.7554473},
url = {https://doi.org/10.5281/zenodo.7554473}
}