Very nice job, the code is great!
Context. Autoregressive models (including Image Transformer) can be viewed as a one-layer normalizing flow, as described in Section 4 of this article. The article includes code that trains PixelCNN as a flow. I am trying to do the same for Image Transformer using your implementation.
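To make the flow view concrete, here is the change of variables I have in mind. This is only a sketch in my own notation, not taken from the article or from your code: I assume continuous (e.g. dequantized) pixel values x_1, ..., x_D, and F_i denotes the conditional CDF of the i-th pixel under the model.

```latex
% Map each pixel through its conditional CDF (assumed notation):
z_i = F_i(x_i \mid x_{<i}), \qquad i = 1, \dots, D.

% z_i depends only on x_1, \dots, x_i, so the Jacobian is lower triangular:
\frac{\partial z_i}{\partial x_j} = 0 \ \text{for } j > i,
\qquad
\frac{\partial z_i}{\partial x_i} = p(x_i \mid x_{<i}).

% With a uniform base density on [0,1]^D, the change of variables gives
\log p(x) = \log \left| \det \frac{\partial z}{\partial x} \right|
          = \sum_{i=1}^{D} \log p(x_i \mid x_{<i}).
```

Under these assumptions the flow log-likelihood is the usual autoregressive log-likelihood, and the ordering of the x_i is exactly the generation order I ask about below.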
Question 1. PixelCNN generates pixels in raster-scan order. Does your Image Transformer implementation also use raster-scan order?
https://github.com/sahajgarg/image_transformer/blob/d33b8d007299b434c62e068e1dad35b8a2688212/image_transformer.py#L208
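To be precise about what I mean by raster-scan order: pixels are visited row by row, left to right, and each position may depend only on positions that come earlier. Below is a minimal sketch of that ordering and the matching causal mask. The function names `raster_scan_order` and `causal_mask` are my own, hypothetical and not from your repository, and I ignore the channel dimension for simplicity.

```python
import numpy as np


def raster_scan_order(height, width):
    """Return (row, col) pairs row by row, left to right within each row."""
    return [(r, c) for r in range(height) for c in range(width)]


def causal_mask(height, width):
    """Boolean mask over flattened positions (hypothetical helper).

    mask[i, j] is True when position j comes strictly before position i
    in raster-scan order, i.e. position i is allowed to condition on j.
    """
    n = height * width
    # Strictly lower-triangular: a position never conditions on itself.
    return np.tril(np.ones((n, n), dtype=bool), k=-1)


if __name__ == "__main__":
    print(raster_scan_order(2, 3))
    print(causal_mask(2, 2).astype(int))
```

If your implementation follows this ordering, the flattened index of pixel (r, c) would be r * width + c, and I could reuse the same ordering when inverting the flow.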