Wentong-DST / im2p

TensorFlow implementation of the paper: A Hierarchical Approach for Generating Descriptive Image Paragraphs
MIT License

Problem in evaluation #4

Open JohnDreamer opened 5 years ago

JohnDreamer commented 5 years ago

I'm confused about how to evaluate. Should I treat the whole generated paragraph (multiple sentences) as one long sentence, and treat the ground-truth paragraph as one long sentence too, then pass them to BLEU, CIDEr, and so on? Or should I change bleu.py and cider.py to evaluate paragraphs sentence by sentence, matching each generated sentence against one ground-truth sentence? I hope you can help me with this. Thank you!
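To make the two options concrete, here is a minimal sketch of how the inputs would be shaped in each case. It is an illustration only: it uses a simplified clipped n-gram precision as a stand-in metric, not the repo's bleu.py or cider.py, and the function names (`score_whole_paragraph`, `score_sentence_by_sentence`) are hypothetical. It assumes each paragraph is already split into a list of sentences.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precision(candidate, reference, n=1):
    """Simplified clipped n-gram precision (a stand-in, NOT full BLEU or CIDEr)."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    # Each candidate n-gram is credited at most as often as it occurs in the reference.
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return matched / total


def score_whole_paragraph(gen_sentences, ref_sentences, n=1):
    """Option 1: join each paragraph into one long string and score once."""
    return clipped_precision(" ".join(gen_sentences), " ".join(ref_sentences), n)


def score_sentence_by_sentence(gen_sentences, ref_sentences, n=1):
    """Option 2: pair the i-th generated sentence with the i-th reference sentence
    and average the per-pair scores (zip drops any unpaired trailing sentences)."""
    pairs = list(zip(gen_sentences, ref_sentences))
    if not pairs:
        return 0.0
    return sum(clipped_precision(g, r, n) for g, r in pairs) / len(pairs)


# Same content, but the two sentences appear in a different order.
generated = ["a man rides a horse", "the sky is blue"]
reference = ["the sky is blue", "a man rides a horse"]

whole = score_whole_paragraph(generated, reference)
paired = score_sentence_by_sentence(generated, reference)
print(whole, paired)
```

With this toy input, the whole-paragraph score is unaffected by sentence order at the unigram level, while the sentence-by-sentence score depends entirely on how sentences are aligned. That alignment question is the main practical difference between the two options.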