About the field image caption

sgrvinod / a-PyTorch-Tutorial-to-Image-Captioning

Show, Attend, and Tell | a PyTorch Tutorial to Image Captioning

MIT License

About the field image caption #16

Closed ZHUXUHAN closed 5 years ago

ZHUXUHAN commented 6 years ago

How long have you been studying this field, and what have you published in it? I have just started my research; can you recommend some publications?

sgrvinod commented 6 years ago

Hi, I'm quite new to Image Captioning and computer vision in general, as my experience is mostly in NLP. In fact, one of the reasons I started doing these tutorials in computer vision is because it's a great way to learn. My academic background is in energy/environment, and not in machine/deep learning.

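As you probably know, Show, Attend, and Tell is one of the older models, but it's a great place to start learning. If you want to learn newer, better image captioning models, you should definitely check out the following papers:

Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning

Self-critical Sequence Training for Image Captioning

Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering

I hope to do tutorials on them later.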

ZHUXUHAN commented 6 years ago

Hi, I'm quite new to Image Captioning and computer vision in general, as my experience is mostly in NLP. In fact, one of the reasons I started doing these tutorials in computer vision is because it's a great way to learn. My academic background is in energy/environment, and not in machine/deep learning.

As you probably know, Show, Attend, and Tell is one of the older models, but it's a great place to start learning. If you want to learn newer, better image captioning models, you should definitely check out the following papers -

Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning

Self-critical Sequence Training for Image Captioning

Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering

I hope to do tutorials on them later.

Thank you very much for your answers; they have helped me a lot.

fawazsammani commented 6 years ago

@ZHUXUHAN if you'd like to extend your knowledge, then I have extended this implementation to implement "Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning". You can find it here

ZHUXUHAN commented 6 years ago

Your project is great. I will keep studying your latest projects. Thank you very much for your guidance.