alasdairtran / transform-and-tell

[CVPR 2020] Transform and Tell: Entity-Aware News Image Captioning
https://transform-and-tell.ml/

Attention is None #35

g-luo opened this issue 2 years ago (status: Open)

g-luo commented 2 years ago

I'm getting an error at this line https://github.com/alasdairtran/transform-and-tell/blob/c8da745646fbc4a823079cbe7bc8b75659b770b6/tell/models/decoder_faces_objects.py#L289 where attn seems to be None. @alasdairtran, do you have any insight into why this error might be occurring?

For context, I'm using my own custom dataset.

Thanks!

        attns['image'] = attn.cpu().detach().numpy()
alasdairtran commented 2 years ago

My bad. I think I added that line when I was working on the camera-ready version, to extract the attention scores. But attention scores are only generated during evaluation, not during training (to save some computation).

I've added a commit to fix it (it's just a check for whether attn is None).
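
For reference, a minimal sketch of the kind of guard described above (the actual commit may differ slightly):

    # Only store attention scores when the decoder actually returned them,
    # i.e. during evaluation; during training attn is None.
    if attn is not None:
        attns['image'] = attn.cpu().detach().numpy()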

g-luo commented 2 years ago

Thanks, Alasdair! That resolved the issue. Another question: I'm getting this warning and I'm wondering whether it's safe to ignore:

/opt/conda/conda-bld/pytorch_1591914855613/work/aten/src/ATen/native/cuda/RangeFactories.cu:239: UserWarning: The number of elements in the out tensor of shape [103] is 103 which does not match the computed number of elements 112. Note that this may occur as a result of rounding error. The out tensor will be resized to a tensor of shape (112,).
alasdairtran commented 2 years ago

I remember seeing similar warnings. It should be safe to ignore. I think it has to do with the way the sinusoidal positional embeddings are resized on the fly when the model sees a longer input than before.
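
For anyone curious, that message is what PyTorch prints when a range-factory op such as torch.arange is given an out= tensor whose size no longer matches the requested range. A minimal reproduction of the mechanism (not necessarily the exact call site in this repo) looks like this:

    import torch

    # Cached position buffer built for a sequence of length 103.
    positions = torch.arange(103)

    # A longer input (length 112) arrives. Reusing the stale buffer as `out=`
    # makes PyTorch resize it and emit the warning above; the resized result
    # is still correct, so the warning is benign.
    positions = torch.arange(112, out=positions)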