Code error and issues in Chapter02/image_captioning_pytorch.ipynb

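a) In the forward function of class LSTMModel, the parameter is named capts, but
self.embedding_layer is called with caps (without the t). The call therefore refers to an undefined name, or to a global variable named caps if one happens to exist in the notebook: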

def forward(self, input_features, capts, lens):
        """Decode image feature vectors and generates captions."""
        embeddings = self.embedding_layer(caps)
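
A minimal sketch of the corrected method, for illustration only. Only LSTMModel, forward, embedding_layer and the parameter names come from the notebook excerpt above; the other layer names (lstm_layer, linear_layer), the constructor arguments and the rest of the method body are assumptions modelled on the usual encoder/decoder captioning pattern.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence


class LSTMModel(nn.Module):
    """Caption decoder sketch; layer names other than embedding_layer are assumed."""

    def __init__(self, embedding_size, hidden_size, vocab_size, num_layers=1):
        super().__init__()
        self.embedding_layer = nn.Embedding(vocab_size, embedding_size)
        self.lstm_layer = nn.LSTM(embedding_size, hidden_size, num_layers,
                                  batch_first=True)
        self.linear_layer = nn.Linear(hidden_size, vocab_size)

    def forward(self, input_features, capts, lens):
        """Decode image feature vectors and generate caption logits."""
        # Fix: use the parameter name capts (the notebook had caps here).
        embeddings = self.embedding_layer(capts)
        # Prepend the image feature vector as the first step of each sequence.
        embeddings = torch.cat((input_features.unsqueeze(1), embeddings), 1)
        # lens must be sorted in descending order (enforce_sorted defaults to True).
        packed = pack_padded_sequence(embeddings, lens, batch_first=True)
        hidden, _ = self.lstm_layer(packed)
        # hidden is a PackedSequence; .data holds the non-padded time steps.
        return self.linear_layer(hidden.data)
```

With this change the method no longer depends on a variable defined outside the class.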

b) On macOS, model training only works with num_workers=0. With num_workers greater than 0, as used in the notebook, no data is loaded.

# Build data loader
custom_data_loader = get_loader('data_dir/resized_images', 'data_dir/annotations/captions_train2014.json', vocabulary, 
                         transform, 128,
                         shuffle=True, num_workers=2)
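
One possible workaround, sketched under the assumption that the problem is specific to macOS: choose the worker count from the platform instead of hard-coding it. The helper name pick_num_workers and its system_name argument are hypothetical and not part of the notebook.

```python
import platform


def pick_num_workers(requested=2, system_name=None):
    """Return 0 on macOS and the requested worker count elsewhere.

    system_name is only there to make the helper easy to test; by default
    the value reported by platform.system() is used ("Darwin" on macOS).
    """
    if system_name is None:
        system_name = platform.system()
    return 0 if system_name == "Darwin" else requested


# Intended use with the notebook's loader (not executed here):
# custom_data_loader = get_loader(..., shuffle=True,
#                                 num_workers=pick_num_workers(2))
```

Loading in the main process is slower, but it keeps the notebook runnable on both platforms.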

Related PyTorch issues:

c) Proposals for improving the image captioning example in Chapter 2 of the book:

Show how to write a custom image dataset and data loader for one's own data. The example given in the book just reiterates the COCO example from the pytorch-tutorial https://github.com/yunjey/pytorch-tutorial/tree/master/tutorials/03-advanced/image_captioning but does not demonstrate how to do the same for another, custom dataset.
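
To make the request concrete, here is a rough sketch of what such an example could look like. Everything in it is an assumption for illustration: the class name CaptionDataset, the collate_captions function, the load_image callable, whitespace tokenization, and the convention that id 0 is the padding token.

```python
import torch
from torch.utils.data import Dataset


class CaptionDataset(Dataset):
    """(image, caption) pairs read from a user-supplied list (hypothetical)."""

    def __init__(self, samples, word_to_idx, load_image):
        # samples: list of (image_path, caption_string) tuples
        # word_to_idx: dict from token to integer id, must contain "<unk>"
        # load_image: callable mapping a path to a tensor of shape (C, H, W)
        self.samples = samples
        self.word_to_idx = word_to_idx
        self.load_image = load_image

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        path, caption = self.samples[index]
        unk = self.word_to_idx["<unk>"]
        # Naive whitespace tokenization; unknown words map to "<unk>".
        ids = [self.word_to_idx.get(tok, unk) for tok in caption.lower().split()]
        return self.load_image(path), torch.tensor(ids, dtype=torch.long)


def collate_captions(batch):
    """Pad captions to equal length; id 0 is assumed to be the padding token."""
    # Sort by caption length (descending) so the lengths can be passed to
    # pack_padded_sequence without further reordering.
    batch = sorted(batch, key=lambda pair: len(pair[1]), reverse=True)
    images, captions = zip(*batch)
    lens = [len(c) for c in captions]
    padded = torch.zeros(len(captions), max(lens), dtype=torch.long)
    for i, cap in enumerate(captions):
        padded[i, :lens[i]] = cap
    return torch.stack(images), padded, lens
```

Such a dataset would be passed to torch.utils.data.DataLoader with collate_fn=collate_captions, so each batch yields images, padded captions and caption lengths.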
Training the model takes a long time, and the per-batch perplexity improves only slightly after the first 1000 steps of epoch 1. It would therefore be great to show how a PyTorch learning rate scheduler can be used to improve training, both at the batch level and at the epoch level. See for example https://discuss.pytorch.org/t/how-to-adjust-learning-rate-according-to-batch-step-rather-than-epoch/20690
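
As an illustration of the difference, the following sketch steps a torch.optim.lr_scheduler.StepLR either after every batch or once per epoch. The tiny linear model, the random data, the function name train_with_scheduler and all hyperparameters are placeholders, not values from the notebook.

```python
import torch
import torch.nn as nn
from torch.optim.lr_scheduler import StepLR


def train_with_scheduler(num_epochs, steps_per_epoch, step_size, per_batch):
    """Run a toy training loop and return the final learning rate.

    per_batch=True  -> scheduler.step() after every optimizer step
    per_batch=False -> scheduler.step() once at the end of each epoch
    """
    torch.manual_seed(0)
    model = nn.Linear(4, 2)  # stand-in for the captioning model
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    # Halve the learning rate every `step_size` scheduler steps.
    scheduler = StepLR(optimizer, step_size=step_size, gamma=0.5)
    loss_fn = nn.MSELoss()

    for _ in range(num_epochs):
        for _ in range(steps_per_epoch):
            inputs, targets = torch.randn(8, 4), torch.randn(8, 2)
            optimizer.zero_grad()
            loss = loss_fn(model(inputs), targets)
            loss.backward()
            optimizer.step()
            if per_batch:
                scheduler.step()  # batch-level schedule
        if not per_batch:
            scheduler.step()      # epoch-level schedule
    return optimizer.param_groups[0]["lr"]
```

The only difference between the two modes is where scheduler.step() is called; in both cases it comes after optimizer.step().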

d) I really like the explanation of transformer models in the YouTube playlist Visual Guide to Transformers Neural Networks, which is more intuitive than the explanation on page 149 ("Understanding the transformer model architecture") of the Mastering PyTorch book.

PacktPublishing / Mastering-PyTorch