TIan1874 opened 1 year ago
Thank you very much for your excellent work! I want to reproduce it, and I am wondering if you could release the code for calculating the metrics in the paper, such as BLEU-4, METEOR, CIDEr, SPICE, and RefCLIPScore. I would appreciate it if you could take the time to reply.
We use the evaluation toolkit from https://github.com/jmhessel/clipscore to compute the metrics.
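For anyone else trying to reproduce the numbers: CLIPScore is defined as CLIP-S(c, v) = 2.5 · max(cos(c, v), 0), and RefCLIPScore is the harmonic mean of CLIP-S and the best candidate-reference similarity. The sketch below only illustrates those formulas using Hugging Face's CLIP; it is not the official implementation (the linked clipscore toolkit is the reference for the paper's numbers), and the model name and file paths are placeholders.

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Illustrative only: the model checkpoint and image paths below are
# placeholders, not the exact configuration used in the paper.
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").eval()
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def embed_texts(texts):
    inputs = processor(text=texts, return_tensors="pt", padding=True, truncation=True)
    with torch.no_grad():
        feats = model.get_text_features(**inputs)
    return torch.nn.functional.normalize(feats, dim=-1)

def embed_image(image_path):
    inputs = processor(images=Image.open(image_path), return_tensors="pt")
    with torch.no_grad():
        feats = model.get_image_features(**inputs)
    return torch.nn.functional.normalize(feats, dim=-1)

def clip_score(candidate, image_path, w=2.5):
    # CLIP-S(c, v) = w * max(cos(c, v), 0)
    c = embed_texts([candidate])
    v = embed_image(image_path)
    return (w * torch.clamp((c * v).sum(dim=-1), min=0)).item()

def ref_clip_score(candidate, references, image_path):
    # RefCLIP-S: harmonic mean of CLIP-S and the best candidate-reference
    # cosine similarity (clamped at 0), following the CLIPScore paper.
    cs = clip_score(candidate, image_path)
    c = embed_texts([candidate])
    r = embed_texts(references)
    ref_sim = torch.clamp(c @ r.T, min=0).max().item()
    return 2 * cs * ref_sim / (cs + ref_sim)
```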
Thanks very much!
I also see that the diversity metrics Vocab, S-C, Div-1, and Div-2 are reported in the paper. Is there any related code for the diversity metric evaluation that could be provided? @joeyz0z
Diversity metrics are based on https://github.com/qingzwang/DiversityMetrics
About the diversity metrics Div-1 and Div-2: I followed the link you posted in this issue (https://github.com/qingzwang/DiversityMetrics), but I could not find a way to compute the n-gram diversity metrics such as Div-2 there. How are they calculated?
Here is the code: https://github.com/joeyz0z/ConZIC/blob/main/compute_n_div.py
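For readers who reach this thread before opening that file: a common definition of Div-n is the number of distinct n-grams divided by the total number of n-grams in the set of captions generated for each image, averaged over images. The sketch below only illustrates that definition; compute_n_div.py is the authoritative implementation for the numbers in the paper.

```python
def ngrams(tokens, n):
    # All contiguous n-grams of a token list.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def div_n(captions_per_image, n):
    # Div-n: distinct n-grams / total n-grams over the captions generated
    # for one image, averaged across images (common definition; the exact
    # variant used in the paper is in compute_n_div.py).
    scores = []
    for captions in captions_per_image:
        all_ngrams = []
        for cap in captions:
            all_ngrams += ngrams(cap.lower().split(), n)
        if all_ngrams:
            scores.append(len(set(all_ngrams)) / len(all_ngrams))
    return sum(scores) / len(scores)

# Toy example: two images, each with a handful of generated captions.
captions_per_image = [
    ["a dog runs on the grass", "a dog plays on green grass"],
    ["a man rides a bike", "a person riding a bicycle downhill"],
]
print("Div-1:", div_n(captions_per_image, 1))
print("Div-2:", div_n(captions_per_image, 2))
```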
I would like to know whether the prompt is kept when calculating the evaluation metrics. Thank you very much for your excellent work!
Yes, we keep the prompt.
Thank you very much for your reply; I am following your work! I have a question about the Vocab metric: are all 25,000 captions (5,000 images × 5 captions per image) involved in computing this metric? If so, are stop words other than those in stop_words.txt also introduced?
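To make the question concrete, here is one straightforward way to compute a vocabulary-size metric under the reading described above (unique non-stop-word tokens over all generated captions, with stop words loaded from stop_words.txt). Whether this matches the paper's exact procedure is precisely what is being asked, so treat it only as an illustration.

```python
def vocab_size(all_captions, stop_words):
    # Count unique tokens across every generated caption, skipping stop words.
    vocab = set()
    for cap in all_captions:
        for tok in cap.lower().split():
            if tok not in stop_words:
                vocab.add(tok)
    return len(vocab)

# stop_words.txt (from this repo) is assumed to hold one stop word per line.
with open("stop_words.txt") as f:
    stop_words = {line.strip() for line in f if line.strip()}

# Toy example; in the setting discussed above this would be the full set of
# 25,000 generated captions (5,000 images x 5 captions each).
captions = ["a dog runs on the grass", "a man rides a bike near the park"]
print(vocab_size(captions, stop_words))
```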