Tight PAC-Bayes Compression Bounds


This repository hosts the code for [PAC-Bayes Compression Bounds So Tight That They Can Explain Generalization]() (NeurIPS 2022) by Sanae Lotfi*, Marc Finzi*, Sanyam Kapoor*, Andres Potapczynski*, Micah Goldblum, and Andrew Gordon Wilson.

Setup

conda env create -f environment.yml -n pactl

Then install the pactl package in editable mode.

pip install -e .

Usage

We use Fire for CLI parsing.

Training Intrinsic Dimensionality Models

python experiments/train.py --dataset=cifar10 \
                            --model-name=resnet18k \
                            --base-width=64 \
                            --optimizer=adam \
                            --epochs=500 \
                            --lr=1e-3 \
                            --intrinsic_dim=1000 \
                            --intrinsic_mode=rdkronqr \
                            --seed=137

All arguments of the main method in experiments/train.py are valid CLI arguments; the most important ones are shown in the example above.
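
The --intrinsic_dim argument means that only a low-dimensional vector is trained, which a fixed projection maps back into the full weight space. The sketch below illustrates that idea only; it is not the repository's rdkronqr projector, the class name is hypothetical, and it uses a dense Gaussian projection for clarity.

import torch
import torch.nn as nn
from torch.func import functional_call
from math import prod

class IntrinsicWrapper(nn.Module):
    """Illustrative only: train a d-dimensional vector z with effective
    weights theta_0 + P z. The repo's projectors (e.g. rdkronqr) are
    structured for memory efficiency; this sketch uses a dense Gaussian P."""

    def __init__(self, net: nn.Module, intrinsic_dim: int):
        super().__init__()
        self.net = net
        self.shapes = {n: p.shape for n, p in net.named_parameters()}
        flat_init = torch.cat([p.detach().reshape(-1) for p in net.parameters()])
        self.register_buffer("theta0", flat_init)                            # frozen initialization
        D = flat_init.numel()
        self.register_buffer("P", torch.randn(D, intrinsic_dim) / D ** 0.5)  # fixed random projection
        self.z = nn.Parameter(torch.zeros(intrinsic_dim))                    # the only trained parameters

    def forward(self, x):
        theta = self.theta0 + self.P @ self.z                  # map back to full weight space
        params, offset = {}, 0
        for name, shape in self.shapes.items():
            n = prod(shape)
            params[name] = theta[offset:offset + n].view(shape)
            offset += n
        return functional_call(self.net, params, (x,))         # stateless forward through net

Only wrapper.z would be passed to the optimizer. A dense D x d projection is prohibitively large for real networks, which is why structured projections such as --intrinsic_mode=rdkronqr exist.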

Distributed Training

Distributed training is helpful for large datasets like ImageNet, spreading computation over multiple GPUs. We rely on torchrun.

To use multiple GPUs on a single node, rerun the same command as above, replacing python with torchrun:

CUDA_VISIBLE_DEVICES=0,1 \
torchrun --nproc_per_node=2 --rdzv_endpoint=localhost:9999 experiments/train.py ...

All other CLI arguments stay the same.
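
torchrun launches one process per GPU and sets RANK, WORLD_SIZE, and LOCAL_RANK in each process's environment; experiments/train.py handles this setup itself. The sketch below (with a hypothetical function name) only illustrates the contract a torchrun-launched script typically follows.

import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def setup_distributed(model: torch.nn.Module) -> torch.nn.Module:
    """Bind this process to its GPU and wrap the model in DDP.
    torchrun provides RANK/WORLD_SIZE/LOCAL_RANK via the environment."""
    local_rank = int(os.environ.get("LOCAL_RANK", 0))
    torch.cuda.set_device(local_rank)
    dist.init_process_group(backend="nccl")    # rank and world size are read from the environment
    model = model.cuda(local_rank)
    return DDP(model, device_ids=[local_rank])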

Transfer Learning using Existing Checkpoints

The key argument needed for transfer is the path to the pretrained network's configuration file, net.cfg.yml.

python experiments/train.py --dataset=fmnist \
                            --optimizer=adam \
                            --epochs=500 \
                            --lr=1e-3 \
                            --intrinsic_dim=1000 \
                            --intrinsic_mode=rdkronqr \
                            --prenet_cfg_path=<path/to/net.cfg.yml> \
                            --seed=137 \
                            --transfer

In addition to the earlier arguments, the only new key argument is --prenet_cfg_path, the path to the pretrained network's net.cfg.yml, used together with the --transfer flag.
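
net.cfg.yml is written alongside the pretrained checkpoint and records how to rebuild that network. Its exact keys are defined by pactl; the snippet below only shows inspecting it with PyYAML and makes no assumption about its contents.

import yaml

# Inspect the pretrained network's configuration; the available keys
# are defined by pactl, not by this sketch.
with open("path/to/net.cfg.yml") as f:
    cfg = yaml.safe_load(f)
print(cfg)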

Training for Data-Dependent Bounds

Data-dependent bounds first require pre-training on a fixed subset of the training data and then training an intrinsic-dimensionality model on the remaining training data.

For such training, we use the following command:

python experiments/train_dd_priors.py --dataset=cifar10 \
                                      ...
                                      --indices_path=<path/to/index/list> \
                                      --train-subset=0.1 \
                                      --seed=137

The key new arguments here, in addition to the ones seen previously, are --indices_path, the path to the saved list of indices defining the fixed pre-training subset, and --train-subset, the fraction of the training data used for pre-training.
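
The exact index-file format expected by train_dd_priors.py is defined in the repo; as an assumption, the sketch below writes such an index list as a saved NumPy array.

import numpy as np

# Assumption: the index file is simply a saved array of training-set indices.
# CIFAR-10 has 50,000 training examples; pick a reproducible random 10% subset.
rng = np.random.default_rng(137)
indices = rng.choice(50_000, size=5_000, replace=False)
np.save("cifar10_prior_indices.npy", indices)   # pass this path via --indices_path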

Computing our Adaptive Compression Bounds

Once we have the checkpoints of intrinsic-dimensionality models, the bound can be computed using:

python experiments/compute_bound.py --dataset=mnist \
                                    --misc-extra-bits=7 \
                                    --quant-epochs=30 \
                                    --levels=50 \
                                    --lr=0.0001 \
                                    --prenet_cfg_path=<path/to/net.cfg.yml> \
                                    --use_kmeans=True

The key arguments here control the quantization used for compression: --levels (number of quantization levels), --quant-epochs and --lr (quantization training), --use_kmeans (k-means initialization of the levels), and --misc-extra-bits (extra bits added to the compressed description).
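
compute_bound.py implements the paper's adaptive compression bound. As a rough illustration of how a compressed description length turns into a generalization bound, the sketch below numerically inverts the classic kl-form Occam bound, where the compressed size in bits plays the role of the prior description length; it is not the exact bound computed by the script, and all function names are hypothetical.

import math

def kl_bernoulli(q: float, p: float) -> float:
    """KL divergence between Bernoulli(q) and Bernoulli(p)."""
    eps = 1e-12
    q = min(max(q, eps), 1 - eps)
    p = min(max(p, eps), 1 - eps)
    return q * math.log(q / p) + (1 - q) * math.log((1 - q) / (1 - p))

def inverse_kl_upper(q_hat: float, budget: float, tol: float = 1e-9) -> float:
    """Largest p >= q_hat with kl(q_hat || p) <= budget, found by bisection."""
    lo, hi = q_hat, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if kl_bernoulli(q_hat, mid) <= budget:
            lo = mid
        else:
            hi = mid
    return lo

def occam_kl_bound(train_err: float, message_bits: float, m: int, delta: float = 0.05) -> float:
    """Generic compression bound: kl(train_err || true_err) <=
    (message_bits * ln 2 + ln(1/delta)) / m, inverted numerically."""
    budget = (message_bits * math.log(2) + math.log(1.0 / delta)) / m
    return inverse_kl_upper(train_err, budget)

# Example: 1% training error, a 10-kbit compressed model, 50,000 training points.
print(occam_kl_bound(0.01, 10_000, 50_000))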

LICENSE

Apache 2.0