krasserm / fairseq-image-captioning

Transformer-based image captioning extension for pytorch/fairseq
Apache License 2.0
312 stars 55 forks

Issue preprocessing according to the instructions #13

Closed alex-calderwood closed 4 years ago

alex-calderwood commented 4 years ago

There seems to be an issue with model/inception.py (a missing import).

I don't see how I can fix this: it looks like a missing import, which I assume means the entire project is broken, unless I'm missing something. Here's my error message:

./preprocess_images.sh ms-coco/
  0%|          | 0/14161 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "preprocess/preprocess_images.py", line 62, in <module>
    main(parser.parse_args())
  File "preprocess/preprocess_images.py", line 39, in main
    outs = inception(imgs.to(args.device)).permute(0, 2, 3, 1).view(-1, 64, 2048)
  File "/home/ubuntu/anaconda3/envs/fairseq-image-captioning/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/fairseq-image-captioning/model/inception.py", line 25, in forward
    x = F.max_pool2d(x, kernel_size=3, stride=2)
NameError: name 'F' is not defined

In the relevant file, you can see that F is never imported.

alex-calderwood commented 4 years ago

Adding the line import torch.nn.functional as F to model/inception.py does seem to fix the issue, but the fact that this went unaddressed gives me pause about building on this framework. Has anyone tried running this?
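For readers hitting the same NameError, the fix described above can be sketched as follows. This is a minimal illustration, not the repository's actual code: the PoolStep class and the dummy input are hypothetical stand-ins, and only the import line and the F.max_pool2d call are taken from the thread.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F  # the import the traceback shows was missing


class PoolStep(nn.Module):
    """Hypothetical stand-in for the failing step in model/inception.py."""

    def forward(self, x):
        # Same call as in the traceback; it resolves once F is imported.
        return F.max_pool2d(x, kernel_size=3, stride=2)


x = torch.zeros(1, 3, 8, 8)  # dummy batch: 1 image, 3 channels, 8x8 pixels
y = PoolStep()(x)
print(tuple(y.shape))  # (1, 3, 3, 3)
```

Without the import, Python only fails when forward() actually runs, which is why the error surfaced during preprocessing rather than when the module was loaded.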

krasserm commented 4 years ago

Thanks @alex-calderwood for catching and fixing this bug. It is a regression from a recent refactoring, and I didn't run grid feature extraction at the time. Sorry for the inconvenience. There should be integration tests that automatically catch such issues, but as I've written in the README, this project is still a work in progress. Appreciate your contribution!
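A regression like this one can also be caught without running the model at all. The sketch below is not the integration test mentioned above but a coarse, standard-library-only static check: it flags names that a module reads but never binds. The unbound_names helper and the sample source strings are hypothetical, and the check ignores scoping, so it can miss shadowing cases or be confused by star imports. Linters such as pyflakes do this far more thoroughly.

```python
import ast
import builtins


def unbound_names(source: str) -> set:
    """Return names that are read somewhere in `source` but never bound in it."""
    bound = set(dir(builtins))
    used = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # `import a.b` binds `a`; `import a.b as c` binds `c`.
                bound.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            bound.add(node.name)
        elif isinstance(node, ast.arg):
            bound.add(node.arg)
        elif isinstance(node, ast.ExceptHandler) and node.name:
            bound.add(node.name)
        elif isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Load):
                used.add(node.id)
            else:  # Store or Del context binds the name
                bound.add(node.id)
    return used - bound


# Hypothetical module source reproducing the bug from this issue.
broken = '''
import torch.nn as nn

class Pool(nn.Module):
    def forward(self, x):
        return F.max_pool2d(x, kernel_size=3, stride=2)
'''
fixed = "import torch.nn.functional as F\n" + broken

print(unbound_names(broken))  # {'F'}
print(unbound_names(fixed))   # set()
```

Because the check only parses the source, it needs neither PyTorch nor the MS-COCO data, so it could run on every commit even when the full feature extraction is too slow to run routinely.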

alex-calderwood commented 4 years ago

Awesome! And thank you for building this. I've been looking for a transformer-based captioning system and the architecture you've described is very cutting edge.