Find the best metric for evaluating the performance of the model.

tohiddar commented 2 years ago

TensorFlow and Keras provide metrics that can be used to evaluate how a model is performing. In our application, the metric should evaluate how good the generated captions are.
More info on TensorFlow metric functions: https://www.tensorflow.org/api_docs/python/tf/keras/metrics
More info on Keras metrics: https://keras.io/api/metrics/

For this task, you do not need to run any code or evaluate the model. For now, we would like to collect the information relevant to these metrics and comment on how we could apply them to our application and why we think they are appropriate.

tohiddar commented 2 years ago

@minoojaf Assigning this task to you. Let me know if you have any questions about it.

tohiddar commented 2 years ago

We talked about how the best option might be to use NLP metrics. There are also a couple of papers that might help us understand how to evaluate a trained image captioning model.

minoojaf commented 2 years ago

BLEU, METEOR, ROUGE, CIDEr, and SPICE are common metrics for image captioning. I think the best approach is a learning-based evaluation metric; the paper is linked here: https://vision.cornell.edu/se3/wp-content/uploads/2018/03/1501.pdf
Another paper that caught my eye and could help us in the process: https://arxiv.org/pdf/1809.02156.pdf
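
To make the n-gram metrics above more concrete, here is a minimal sketch of sentence-level BLEU (clipped n-gram precision combined with a brevity penalty) in plain Python. The function names are made up for illustration and are not from our repo, the captions are assumed to be already tokenized into word lists, and no smoothing is applied. For real evaluation we would probably rely on an established implementation such as `nltk.translate.bleu_score` rather than this sketch.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, references, n):
    """Clipped n-gram precision of a candidate against several references."""
    cand_counts = ngrams(candidate, n)
    if not cand_counts:
        return 0.0
    # For each n-gram, keep the largest count seen in any single reference.
    max_ref_counts = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref_counts[gram] = max(max_ref_counts[gram], count)
    # A candidate n-gram is credited at most as often as it occurs in a reference.
    clipped = sum(min(count, max_ref_counts[gram])
                  for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def sentence_bleu(candidate, references, max_n=4):
    """Unsmoothed sentence-level BLEU with uniform n-gram weights."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # without smoothing, any zero precision gives a zero score
    log_mean = sum(math.log(p) for p in precisions) / max_n
    # Brevity penalty uses the reference length closest to the candidate length
    # (ties resolved in favour of the shorter reference).
    c = len(candidate)
    r = min((len(ref) for ref in references),
            key=lambda length: (abs(length - c), length))
    bp = 1.0 if c > r else math.exp(1 - r / c)
    return bp * math.exp(log_mean)


if __name__ == "__main__":
    # Hypothetical captions, only to show the call pattern.
    generated = "a dog runs on the grass".split()
    human = ["a dog is running on the grass".split(),
             "a dog runs across the lawn".split()]
    print(sentence_bleu(generated, human, max_n=2))
```

One thing this makes visible: the score only rewards exact word overlap with the reference captions, so a correct caption worded differently scores low. That is part of why the learning-based metric in the first paper looks attractive.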

teamtma / Image_Captioning

Find the best metric for evaluating the performance of the model. #3