yandex-research / rtdl

Research on Tabular Deep Learning: Papers & Packages
Apache License 2.0

RTDL (Research on Tabular Deep Learning)

RTDL (Research on Tabular Deep Learning) is a collection of papers and packages on deep learning for tabular data.

:bell: To follow announcements on new papers and projects:

[!NOTE] The previous `rtdl` package has been replaced with individual packages (see the next sections). If you used `rtdl`, please read the details.

Details:

1. This repository is **NOT** deprecated.
2. However, the package `rtdl` is deprecated and replaced with individual packages.
3. If you used the latest `rtdl==0.0.13` installed from PyPI (not from GitHub!) as `pip install rtdl`, then the same models (MLP, ResNet, FT-Transformer) can be found in the `rtdl_revisiting_models` package, though the API is slightly different.
4. :exclamation: **If you used the unfinished code from the main branch, it is highly recommended to switch to the new packages.** In particular, the unfinished implementation of embeddings for continuous features contained many unresolved issues (the new `rtdl_num_embeddings` package, in turn, is more efficient and correct).

Installation

The documentation is available through the "Package" links in the "Papers" section.

The following snippet installs all available packages including optional dependencies.

pip install rtdl_num_embeddings
pip install rtdl_revisiting_models

pip install "scikit-learn>=1.0,<2"
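After running the commands above, it can be useful to confirm which of the packages actually ended up in the current environment. The following is a minimal sketch that uses only the Python standard library (`importlib.metadata`). The helper name `installed_versions` and the constant `RTDL_PACKAGES` are hypothetical and not part of any RTDL package; the distribution names are taken from the install commands above.

```python
from importlib import metadata

# Distribution names as they appear in the pip commands above.
RTDL_PACKAGES = ("rtdl_num_embeddings", "rtdl_revisiting_models")


def installed_versions(packages=RTDL_PACKAGES):
    """Map each distribution name to its installed version, or None if absent.

    Hypothetical helper for illustration; it only queries package metadata
    and does not import the packages themselves.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Because the check reads installed metadata only, it works even when an optional dependency of one of the packages is missing.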

Papers

(2024) TabReD: A Benchmark of Tabular Machine Learning in-the-Wild
Paper   Code

(2023) TabR: Tabular Deep Learning Meets Nearest Neighbors
Paper   Code

(2022) TabDDPM: Modelling Tabular Data with Diffusion Models
Paper   Code

(2022) Revisiting Pretraining Objectives for Tabular Deep Learning
Paper   Code

(2022) On Embeddings for Numerical Features in Tabular Deep Learning
Paper   Code   Package (rtdl_num_embeddings)

(2021) Revisiting Deep Learning Models for Tabular Data
Paper   Code   Package (rtdl_revisiting_models)

(2019) Neural Oblivious Decision Ensembles for Deep Learning on Tabular Data
Paper   Code