🖼️ Attend to You: Personalized Image Captioning with Context Sequence Memory Networks. In CVPR, 2017. Expanded: Towards Personalized Image Captioning via Multimodal Memory Networks. In IEEE TPAMI, 2018.
Thanks for the great work! It is creative and the results shown are promising.
I saw in the paper that you compare several baselines against your proposed CSMN (e.g., 1-nearest neighbor to user contents, RNN seq2seq with active vocabulary).
Would you release the implementations of those baselines?
Besides that, given the recent advances in NLP (Transformer, GPT-2, etc.), would you design CSMN differently in a modern context (as of 2020), and if so, how?