Open rixejzvdl649 opened 2 months ago
Hi @rixejzvdl649,
I appreciate your interest in our work. Please provide some more information on how we can help. Is this the output generated by Video-ChatGPT? Thank you.
@mmaaz60 My activity_qa accuracy is only 15%, which is nowhere near the 35% reported in the paper. What could be wrong with the code?
Hello, I've also been replicating related benchmarks recently. Most of them rely on GPT-assisted evaluation, which seems quite costly. Approximately how much does each of your evaluations cost?
@hb-jw If you're considering gpt4o-mini, I don't think it's going to cost much. Under $10?
Thank you for your reply! I am using the GPT-3.5 Turbo API. So far I have tested only 200 of the roughly 1500 samples in MSVD-QA under the zero-shot QA setup, and that has already cost me $0.90. Extrapolating from this, testing the entire MSVD-QA would cost about $6. However, the full set of benchmarks (zero-shot QA + videochatgpt benchmark) requires 9 similar tests. What are your thoughts on this? Could there be an error in my calculations?
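For anyone checking the arithmetic, here is a rough sketch of that extrapolation. It assumes cost scales linearly with the number of questions and that each of the 9 tests is about the size of MSVD-QA; neither is confirmed in this thread, and the function name `extrapolate_cost` is made up for illustration.

```python
def extrapolate_cost(pilot_cost_usd, pilot_questions, total_questions, num_tests=1):
    """Linearly scale a pilot run's API cost to a full benchmark suite.

    Assumption: every question costs about the same, and every test
    has roughly `total_questions` samples.
    """
    per_question = pilot_cost_usd / pilot_questions
    per_test = per_question * total_questions
    return per_question, per_test, per_test * num_tests


# Figures quoted in the comment above: $0.90 for 200 of ~1500 questions, 9 tests.
per_question, per_test, all_tests = extrapolate_cost(0.90, 200, 1500, num_tests=9)
print(f"per question: ${per_question:.4f}")
print(f"one test:     ${per_test:.2f}")
print(f"nine tests:   ${all_tests:.2f}")
```

Under these assumptions a single test comes out slightly above the $6 estimate, and nine tests land in the tens of dollars rather than under $10. Actual cost depends on prompt and answer lengths and on the model's pricing, so treat this as an order-of-magnitude check only.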
evaluate_activitynet_qa
v_iKclcQEl4zI_10