tmgrgg / localvsglobaluncertainty

Empirical analysis of recent stochastic gradient methods for approximate inference in Bayesian deep learning, including SWA-Gaussian, MultiSWAG, and deep ensembles. See report_localglobal.pdf.

Experiment 2a: Heat Map Analysis #4

Open tmgrgg opened 4 years ago

tmgrgg commented 4 years ago

I trained a MultiSWAG solution consisting of up to 15 models on DenseNet10 x FashionMNIST, incrementally increasing the rank of each individual SWAG solution. The resulting heatmap gives a broad picture of the complementary benefits of modelling local and global uncertainty:

heatmap

Observations:

tmgrgg commented 4 years ago

After plotting bands of constant space cost, I identify the top 10% of solutions in each band:

(The red lines mark the bands, each of width 10; the dashed blue line is y = x.)

best_solutions

Visually, I'm not sure that this diagram actually shows anything interesting (I'll get back to thinking about this).
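For reference, the band selection described above could be sketched roughly as follows. This is only a sketch under assumptions that are not stated in this thread: the space cost of a solution is taken to be (number of ensemble members) x (SWAG rank), bands are half-open intervals of that cost, and "top" means lowest validation loss. The function name `top_fraction_per_band` and the `loss` array layout are hypothetical.

```python
import numpy as np


def top_fraction_per_band(loss, band_width=10, fraction=0.1):
    """Mark the lowest-loss solutions inside each band of (assumed) space cost.

    loss[i, j] is the validation loss of the solution with (i + 1) ensemble
    members, each of SWAG rank (j + 1). ASSUMPTION: cost = members * rank.
    """
    n_models, n_ranks = loss.shape
    members = np.arange(1, n_models + 1)[:, None]
    ranks = np.arange(1, n_ranks + 1)[None, :]
    cost = members * ranks

    # Costs 1..band_width fall in band 0, the next band_width costs in band 1, etc.
    band = (cost - 1) // band_width

    mask = np.zeros_like(loss, dtype=bool)
    for b in np.unique(band):
        in_band = band == b
        # Keep at least one solution per band, however small the band is.
        k = max(1, int(np.ceil(fraction * in_band.sum())))
        threshold = np.sort(loss[in_band])[k - 1]
        # Ties at the threshold are all kept, so a band may mark more than k cells.
        mask |= in_band & (loss <= threshold)
    return mask


# Toy 4 x 5 grid standing in for the real heatmap values.
toy_loss = np.arange(20.0).reshape(4, 5)
selected = top_fraction_per_band(toy_loss, band_width=10, fraction=0.1)
```

If the true cost also includes the SWA mean or the diagonal term, only the `cost` line would need to change.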

The graph below, however, shows the best solutions for each fixed cost from 0 to 500 (the size of each star marker represents the cost of the solution). It shows more clearly that, with limited resources, we should dedicate them to ensembling. Note, however, that half of this diagram is missing, so it is not yet a true comparison. I do not expect further ensembling to improve predictive performance (this is fairly clear from how constant the heatmap is), but I'll run the full 30 x 30 grid once I have access to the cluster.

foreachcost
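A minimal sketch of how the best solution per fixed cost could be picked out of the grid is given here. It rests on the same unstated assumption that cost = (number of ensemble members) x (SWAG rank), and it treats "fixed cost" as an exact cost rather than a budget; `best_per_cost` and the array layout are hypothetical names for illustration.

```python
import numpy as np


def best_per_cost(loss, max_cost=500):
    """Map each exact cost to its lowest-loss (members, rank, loss) triple.

    loss[i, j] is the validation loss with (i + 1) ensemble members of
    SWAG rank (j + 1). ASSUMPTION: cost = members * rank.
    """
    n_models, n_ranks = loss.shape
    best = {}
    for i in range(n_models):
        for j in range(n_ranks):
            cost = (i + 1) * (j + 1)
            if cost > max_cost:
                continue
            value = float(loss[i, j])
            # Keep the cheapest-loss configuration seen so far for this cost.
            if cost not in best or value < best[cost][2]:
                best[cost] = (i + 1, j + 1, value)
    return best


# Toy 2 x 3 grid: rows are ensemble sizes 1..2, columns are ranks 1..3.
toy_loss = np.array([[3.0, 2.0, 1.0],
                     [0.5, 4.0, 5.0]])
winners = best_per_cost(toy_loss)
```

Comparing the `members` and `rank` entries of each winner shows whether a given cost is better spent on ensembling or on rank.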

tmgrgg commented 4 years ago

In the plot below I have grouped the solutions (squares) into bands of width 20 and normalised each solution's validation loss by subtracting the group mean and dividing by the group standard deviation.

standardised_across_storage_bands

Below is the same plot with a band width of 10:

standardised_across_bands_hot
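The per-band standardisation used for the two plots above might look something like this. It is a sketch, assuming cost = (number of ensemble members) x (SWAG rank) and the population standard deviation (`ddof=0`); neither detail is confirmed in this thread, and `standardise_within_bands` is a hypothetical name.

```python
import numpy as np


def standardise_within_bands(loss, band_width=20):
    """Z-score each validation loss against the other solutions in its cost band.

    loss[i, j] is the validation loss with (i + 1) ensemble members of
    SWAG rank (j + 1). ASSUMPTION: cost = members * rank.
    """
    n_models, n_ranks = loss.shape
    members = np.arange(1, n_models + 1)[:, None]
    ranks = np.arange(1, n_ranks + 1)[None, :]
    band = (members * ranks - 1) // band_width

    z = np.zeros(loss.shape, dtype=float)
    for b in np.unique(band):
        in_band = band == b
        values = loss[in_band]
        std = values.std()  # population standard deviation (ddof=0)
        if std > 0:
            z[in_band] = (values - values.mean()) / std
        # Bands with a single solution or identical losses are left at 0.
    return z


# Toy 4 x 5 grid standing in for the real heatmap values.
toy_loss = np.arange(20.0).reshape(4, 5)
z_scores = standardise_within_bands(toy_loss, band_width=10)
```

Calling it with `band_width=20` and `band_width=10` corresponds to the two band widths mentioned above.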

tmgrgg commented 4 years ago
Screenshot 2020-08-16 at 21 26 17

I reran this experiment, this time training the SWAG solutions for longer (150 training epochs and 150 SWAG epochs, versus 50 training epochs and 100 SWAG epochs), and we get a very different picture.