XTab: Cross-Table Pretrained Transformers

XTab provides a framework for pretraining tabular transformers. Our implementation is based on OpenML-AutoMLBenchmark and AutoGluon.

The pretraining process of XTab uses the distributed training framework provided by OpenML-AutoMLBenchmark. Currently, we only support pretraining on AWS EC2 instances with at least one GPU (e.g., G4dn). The following requirements have to be fulfilled in order to repeat our experiments:

We blocked our information (GitHub repo, S3 bucket, etc.) for review purposes. The above steps must be taken to make the code functional. We strongly encourage users to refer to the AWS mode of OpenML-AutoMLBenchmark for any questions.

Our code is implemented in AutoGluon.multimodal. The AutoGluon package must also be public on GitHub so that EC2 instances can download and run it. XTab is developed on top of AutoGluon 0.5.3. Please refer to AutoGluon for questions on AutoGluon installation or usage.

After configuration, the following code will run cross-table pretraining and save the pretrained backbones on S3.

```sh
# point AutoMLBenchmark at the provided GPU/AWS config, then launch cross-table pretraining in AWS mode with up to 520 parallel jobs
rm -f ~/.config/automlbenchmark/config.yaml
cp ./examples/config_gpu.yaml ~/.config/automlbenchmark/config.yaml
python runbenchmark.py XTab_pretrain ag_pretrain mytest1h -m aws -p 520
```
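
After the AWS run finishes, the saved backbones live in the S3 bucket configured in ~/.config/automlbenchmark/config.yaml. Below is a minimal sketch, assuming a boto3 setup with valid AWS credentials, of pulling them down locally; the bucket name and key prefix are placeholders, not the values used in our runs.

```python
# Sketch only: download pretrained backbones from the configured S3 bucket.
# Replace bucket/prefix with the values from your automlbenchmark config.
import pathlib

import boto3

bucket = "your-automlbenchmark-bucket"   # placeholder
prefix = "xtab/pretrained_backbones/"    # placeholder
out_dir = pathlib.Path("pretrained_ckpts")
out_dir.mkdir(exist_ok=True)

s3 = boto3.client("s3")
for page in s3.get_paginator("list_objects_v2").paginate(Bucket=bucket, Prefix=prefix):
    for obj in page.get("Contents", []):
        dest = out_dir / pathlib.Path(obj["Key"]).name
        s3.download_file(bucket, obj["Key"], str(dest))
```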

Examples:

In the paper, all finetuning experiments are also performed in a distributed setting. Here, we provide a minimal example that can be run locally. We provide the pretrained backbones after 0, 1000, and 2000 pretraining iterations in ./pretrained checkpoints. To show the downstream performance on the Adult Income dataset, simply run:

```sh
python runexample.py --pretrained_ckpts path-to-pretrained-checkpoints
```

Use --batch_size to set the batch size (default 128) and --max_epochs to set the maximum number of finetuning epochs (default 3). If you specify a checkpoint that does not exist, the model trains from randomly initialized weights. Results:

| Pretraining steps | Test AUC | Validation AUC | Train time | Test time |
|---|---|---|---|---|
| iter_0 (no pretraining) | 0.918905 | 0.922689 | 33.828741 | 2.028967 |
| iter_1k (pretraining for 1k iterations) | 0.919736 | 0.923456 | 33.760371 | 2.052341 |
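
Conceptually, the finetuning path works as follows: a backbone pretrained across tables is loaded if the given checkpoint exists (otherwise the weights stay randomly initialized), and a dataset-specific featurizer and prediction head are trained on top. The sketch below illustrates this flow under assumed names and shapes; SharedBackbone, DownstreamModel, and the checkpoint file name are illustrative and are not the classes used in runexample.py.

```python
# Illustrative sketch of XTab-style finetuning (not the repo's actual code).
import torch
import torch.nn as nn

class SharedBackbone(nn.Module):
    """Transformer encoder shared across tables -- the part XTab pretrains."""
    def __init__(self, d_model=64, n_heads=4, n_layers=2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model, n_heads, dim_feedforward=128, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)

    def forward(self, tokens):                    # tokens: (batch, n_features, d_model)
        return self.encoder(tokens)

class DownstreamModel(nn.Module):
    """Dataset-specific featurizer and head wrapped around the shared backbone."""
    def __init__(self, backbone, d_model=64, n_classes=2):
        super().__init__()
        self.featurizer = nn.Linear(1, d_model)   # embeds each numeric column as a token
        self.backbone = backbone
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, x):                         # x: (batch, n_features)
        tokens = self.featurizer(x.unsqueeze(-1))
        pooled = self.backbone(tokens).mean(dim=1)
        return self.head(pooled)

backbone = SharedBackbone()
ckpt = "path-to-pretrained-checkpoints/iter_1k.pt"   # hypothetical file name
try:
    backbone.load_state_dict(torch.load(ckpt, map_location="cpu"))
except FileNotFoundError:
    pass  # missing checkpoint: keep randomly initialized weights, as runexample.py does

model = DownstreamModel(backbone)
# a standard supervised finetuning loop on the downstream table (e.g., Adult Income) follows
```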