Open MagicPupu opened 2 months ago
Take a look at the Contributor Page. Data has to be entered manually and backed up by a source.
Thank you for your response.
I wanted to know how the result data is generated: whether you manipulate it in any way, or whether it comes directly from the benchmarks. I now understand that I need to look directly at the benchmarks' sources.
Thank you again, and I will continue my research.
Best regards, Antoine
Hello IroncladDev Team,
I would like to use llmarena to test some LLMs and generate performance reports.
However, I am unsure how the results are calculated. Is the success percentage for each test dataset quantitative or qualitative? Specifically, is it the percentage of correct answers within the dataset, or a measure of how precise the answers are?
Thanks in advance, Antoine