UARK-AICV / VLTinT

[AAAI 2023 Oral] VLTinT: Visual-Linguistic Transformer-in-Transformer for Coherent Video Paragraph Captioning
https://uark-aicv.github.io/VLTinT/

The CIDEr score for YouCookII is different #14

Open yangxingrui opened 10 months ago

yangxingrui commented 10 months ago

Similarly, I also trained the model on the YouCookII dataset using the same configuration you provided on GitHub, but my CIDEr score is much lower than yours. Here is a comparison:

| | METEOR | ROUGE_L | CIDEr | Bleu_4 |
|---|---|---|---|---|
| Your scores | 17.94 | 34.55 | 48.7 | 9.4 |
| My scores | 17.27 | 34.3 | 43.71 | 9.11 |

A similar discrepancy occurred with VLCAP.
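To put numbers on the gap, here is a minimal Python sketch that computes the absolute and relative difference per metric from the scores quoted above. The helper name `metric_gaps` is hypothetical and not part of the VLTinT repository.

```python
# Scores copied from the comparison in this comment.
paper = {"METEOR": 17.94, "ROUGE_L": 34.55, "CIDEr": 48.7, "Bleu_4": 9.4}
mine = {"METEOR": 17.27, "ROUGE_L": 34.3, "CIDEr": 43.71, "Bleu_4": 9.11}


def metric_gaps(reference, reproduced):
    """Return {metric: (absolute_gap, relative_gap_percent)} for shared metrics.

    Hypothetical helper for illustration only; a negative gap means the
    reproduced score is lower than the reference score.
    """
    gaps = {}
    for name, ref_value in reference.items():
        if name in reproduced:
            diff = reproduced[name] - ref_value
            gaps[name] = (round(diff, 2), round(100.0 * diff / ref_value, 2))
    return gaps


gaps = metric_gaps(paper, mine)
for name, (abs_gap, rel_gap) in gaps.items():
    print(f"{name:8s} {abs_gap:+.2f} ({rel_gap:+.2f}%)")
```

On these numbers CIDEr is about 10% lower in relative terms, while the other three metrics are each within roughly 4% of the reported values.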

Kashu7100 commented 10 months ago

Thank you for your interest in our work, and sorry for the delayed reply. I cannot tell why you are getting the lower CIDEr score, but I am happy to help you figure it out. For the time being, I have attached the evaluation JSON file used for the paper.

model_best_greedy_pred_val_all_metrics.json

If you can share how you set up the data (dataloader, feature extraction, etc.), I may be able to help you debug the cause.
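As a starting point for the comparison, a sketch like the following could diff the attached file against a locally produced metrics file. The schema of the JSON is an assumption here: the code only assumes numeric values nested inside JSON objects and skips lists. The function names and the second file name in the usage comment are hypothetical.

```python
import json


def flatten_numeric(obj, prefix=""):
    """Collect numeric leaves of nested JSON objects as {dotted.key: value}.

    Assumption: metrics are stored as numbers inside (possibly nested)
    objects. Lists, strings, and booleans are ignored.
    """
    flat = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            flat.update(flatten_numeric(value, f"{prefix}{key}."))
    elif isinstance(obj, (int, float)) and not isinstance(obj, bool):
        flat[prefix.rstrip(".")] = float(obj)
    return flat


def diff_metric_files(reference_path, reproduced_path):
    """Return {key: reproduced - reference} for keys present in both files."""
    with open(reference_path) as f:
        reference = flatten_numeric(json.load(f))
    with open(reproduced_path) as f:
        reproduced = flatten_numeric(json.load(f))
    shared = sorted(reference.keys() & reproduced.keys())
    return {key: reproduced[key] - reference[key] for key in shared}


# Example call ("my_metrics.json" is a hypothetical local file):
# diff_metric_files("model_best_greedy_pred_val_all_metrics.json", "my_metrics.json")
```

Keys that appear in only one of the two files are dropped from the result, so a missing metric shows up as an absent key rather than as an error.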

yangxingrui commented 7 months ago

Sorry for the long delay in continuing our discussion; I was occupied with other matters. The dataloader I'm using is from your published VLCAP code, and the feature extraction follows the Data preparation section of VLTinT, without any modifications.