Inquiry about Costs Associated with Video LLM Benchmarks

Vision-CAIR / MiniGPT4-video

Official code for Goldfish model for long video understanding and MiniGPT4-video for short video understanding

https://vision-cair.github.io/Goldfish_website/

BSD 3-Clause "New" or "Revised" License

Inquiry about Costs Associated with Video LLM Benchmarks #32

Open hb-jw opened 3 months ago

hb-jw commented 3 months ago

Hello everyone,

I have been working on replicating benchmarks related to video-class Large Language Models (LLMs), and I've noticed that most of these benchmarks rely on the GPT-assistant framework. Given the complexity and potential costs associated with these benchmarks, I'm interested in gathering some feedback regarding the financial aspect of conducting such evaluations.

Could anyone share their experiences regarding the typical costs involved in running these benchmarks? Any insights into budgeting for such projects would be highly beneficial to the community.

Thank you!

KerolosAtef commented 3 months ago

Hello @hb-jw, I can't remember the exact cost of the evaluation because I used a shared API for multiple projects. But here is how to estimate it: each request requires about 800 tokens, so multiply the number of samples in each benchmark by 800 to get the total number of tokens, then look up the per-token price on the OpenAI pricing page.

total number of tokens = number_of_requests * 800
estimated cost = (total number of tokens / 1,000,000) * price_per_1M

Here I used GPT-3.5 for evaluation, which costs $1.5 per 1M input tokens.
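
The estimate above can be sketched as a small Python helper. This is only an illustration of the arithmetic, not code from the repository: the function name `estimate_cost` and the sample counts are hypothetical, the 800 tokens per request is the rough figure quoted above, and the 1.5 per 1M tokens covers input tokens only, so output tokens are ignored and current prices should be checked on the OpenAI pricing page.

```python
# Rough per-request token count quoted in the comment above (an approximation).
TOKENS_PER_REQUEST = 800
# GPT-3.5 input price per 1M tokens quoted above; verify against current pricing.
PRICE_PER_1M_INPUT_TOKENS = 1.5


def estimate_cost(number_of_requests,
                  tokens_per_request=TOKENS_PER_REQUEST,
                  price_per_1m=PRICE_PER_1M_INPUT_TOKENS):
    """Return the estimated evaluation cost for a benchmark.

    Counts input tokens only; output tokens are not included.
    """
    total_tokens = number_of_requests * tokens_per_request
    return (total_tokens / 1_000_000) * price_per_1m


if __name__ == "__main__":
    # Hypothetical benchmark sizes, used purely for illustration.
    for n in (1_000, 10_000):
        print(f"{n} requests -> estimated cost {estimate_cost(n):.2f}")
```

For a hypothetical benchmark with 1,000 samples, this gives 800,000 tokens, or about 1.20 at the price quoted above. Passing a different `tokens_per_request` or `price_per_1m` adapts the estimate to another evaluator model.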