weight init
The original paper uses He initialization rather than Xavier. In TensorFlow this would translate to using tf.contrib.layers.variance_scaling_initializer.
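For anyone comparing the two schemes, here is a minimal sketch of the difference in plain Python. It assumes the normal-distribution variants of both (He: variance 2 / fan_in, Xavier: variance 2 / (fan_in + fan_out)). The helper names and the layer shape are made up for illustration and are not taken from this repo. As far as I recall, the default arguments of tf.contrib.layers.variance_scaling_initializer (factor=2.0, mode='FAN_IN', uniform=False) correspond to the He case.

```python
import math
import random


def he_stddev(fan_in):
    """Standard deviation for He (normal) initialization: sqrt(2 / fan_in)."""
    return math.sqrt(2.0 / fan_in)


def xavier_stddev(fan_in, fan_out):
    """Standard deviation for Xavier/Glorot (normal) initialization."""
    return math.sqrt(2.0 / (fan_in + fan_out))


def he_init(fan_in, fan_out, seed=0):
    """Sample a fan_in x fan_out weight matrix using He initialization."""
    rng = random.Random(seed)
    std = he_stddev(fan_in)
    return [[rng.gauss(0.0, std) for _ in range(fan_out)] for _ in range(fan_in)]


# Hypothetical 3x3 convolution with 16 input and 48 output channels.
fan_in = 3 * 3 * 16
fan_out = 3 * 3 * 48
print("He stddev:    ", he_stddev(fan_in))
print("Xavier stddev:", xavier_stddev(fan_in, fan_out))
```

He initialization only looks at fan_in and uses a factor of 2 to compensate for ReLU zeroing roughly half of the activations, so its standard deviation differs from Xavier's whenever the input and output channel counts differ.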
max pooling
In the original paper, the transition down should be a non-overlapping 2x2 max pool. Checking the original Lasagne code, it seems they use a max pool with a 2x2 window and a 2x2 stride. Wasn't sure if your current 4x4 window was intentional.
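To make "non-overlapping" concrete, here is a small sketch of 2D max pooling on a nested list, with no padding. The function name and the toy feature map are hypothetical and only meant to show the window/stride relationship, not how the repo or the Lasagne code implements it.

```python
def max_pool_2d(x, window=2, stride=2):
    """Max-pool a 2D list of numbers without padding.

    When stride == window, the pooling regions do not overlap.
    """
    rows, cols = len(x), len(x[0])
    out = []
    for r in range(0, rows - window + 1, stride):
        out_row = []
        for c in range(0, cols - window + 1, stride):
            # Take the maximum over the window anchored at (r, c).
            out_row.append(
                max(x[r + i][c + j] for i in range(window) for j in range(window))
            )
        out.append(out_row)
    return out


feature_map = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]

# 2x2 window with stride 2: each input value belongs to exactly one window,
# and the spatial resolution is halved.
pooled = max_pool_2d(feature_map, window=2, stride=2)
print(pooled)
```

With a window larger than the stride, neighbouring windows share input values, which is the overlap the paper's transition down avoids.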
Yeah, sorry about those. I made this for a project I was working on, so I may have changed the pooling window and weight inits. Good catch, I'll restore them to the originals.
tf.contrib.layers.variance_scaling_initializer
Great work on this btw, very readable!