evalcrafter / EvalCrafter

[CVPR 2024] EvalCrafter: Benchmarking and Evaluating Large Video Generation Models
http://evalcrafter.github.io

Clarification on IS Score Range for Raw Metrics #21

Closed KihongK closed 3 weeks ago

KihongK commented 1 month ago

Hello EvalCrafter team,

I’ve been examining the Raw Metrics table on EvalCrafter’s GitHub Pages, and I’m curious about the IS column. When I run is.py from the repository on my dataset, the scores I receive are typically between 0 and 2. However, in the Raw Metrics table, I see IS scores ranging from 14 to 17.

Could you help me understand how to replicate these higher IS scores? I’d appreciate knowing if any additional calculations were applied to the IS scores shown in the Raw Metrics or if there’s a recommended dataset that could produce similar values for easier comparison.

Thank you very much for your help!

Yaofang-Liu commented 4 weeks ago

Hi KihongK, you can first try to reproduce the results with the current code in this repo and the dataset here: https://huggingface.co/datasets/RaphaelLiu/EvalCrafter_T2V_Dataset. If you still have problems, please share more details about your dataset. Hope you can figure this out :-)

KihongK commented 3 weeks ago

Hi Yaofang-Liu, sorry, it was my mistake.

For the IS evaluation I should have used a sufficiently large set of videos (at least around 50), but I tested with only a single video, which led to the low score. Thank you for the reply.
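For anyone hitting the same issue: the sample-size dependence follows directly from the IS definition, IS = exp(E_x[KL(p(y|x) || p(y))]), where p(y) is the marginal over the evaluated samples. With a single video the marginal equals the conditional, the KL term is zero, and the score collapses to 1. Below is a minimal numpy sketch (not the repo's is.py; the class count and sample counts are made up) illustrating this:

```python
import numpy as np

def inception_score(probs: np.ndarray) -> float:
    """Toy Inception Score from per-sample class probabilities.

    probs: (N, C) array, each row a softmax output p(y|x).
    IS = exp( mean_x KL( p(y|x) || p(y) ) ), where p(y) is the
    marginal distribution over all N samples.
    """
    eps = 1e-12
    marginal = probs.mean(axis=0, keepdims=True)  # p(y)
    kl = (probs * (np.log(probs + eps) - np.log(marginal + eps))).sum(axis=1)
    return float(np.exp(kl.mean()))

rng = np.random.default_rng(0)
# Many diverse samples: sharp, varied conditionals -> IS well above 1.
many = rng.dirichlet(alpha=np.full(1000, 0.01), size=500)
# A single sample: the marginal equals the conditional, KL = 0 -> IS = 1.
single = many[:1]

print(inception_score(many))    # noticeably > 1
print(inception_score(single))  # 1.0
```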