Open germanjke opened 4 months ago
Yes, there are! :) stay tuned!
Do you have any particular expectations for improvements with the upgrade to the 0.4.0+ backend?
@LSinev Hi, it looks like the vLLM engine, which is supported in 0.4.0, works faster than the HF engine.
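For context, a hedged sketch of how the vLLM backend can be selected in lm-evaluation-harness 0.4.x; the model and task names below are illustrative placeholders, not taken from this thread:

```shell
# Sketch only: assumes lm-evaluation-harness 0.4.x and vllm are installed.
# Model and task names are placeholders.
lm_eval --model vllm \
    --model_args pretrained=my-org/my-model,dtype=auto \
    --tasks hellaswag \
    --batch_size auto
```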
hello guys! Can I ask: are you working on this topic, and do you maybe have some estimated dates? @LSinev
I will give more information next week, or maybe even a work-in-progress branch for playing/testing.
new_harness_codebase — a "work in progress" branch with a submoduled, patched lm-evaluation-harness (waiting for the PR to be merged). All scores will change. The leaderboard will not publish these yet, but you can use the branch for private scoring. Baseline model scoring should be done by you. Changes to the model-running code (the lm-evaluation-harness side) should be made in their repository in order to be supported here.
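For anyone who wants to try the branch, here is a minimal sketch of checking it out together with the submoduled harness; the branch path is taken from the repository link later in this thread, and the rest is an assumption:

```shell
# Sketch: clone the MERA repo on the work-in-progress branch and pull the
# patched lm-evaluation-harness submodule along with it.
git clone --recurse-submodules \
    --branch update/new_harness_codebase \
    https://github.com/ai-forever/MERA.git
cd MERA
# In case any submodules were skipped during clone:
git submodule update --init --recursive
```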
great, thank you!
Hi @LSinev,
I noticed that the tasks from the branch do not include the MERA tasks in 0.4.x
format. I checked the link you provided here, and it seems they are indeed missing.
Could you please confirm if the MERA tasks will be added to this branch, or if there is another location where they might be available?
Thanks!
> I checked the link you provided here
This link goes to a fork of lm-evaluation-harness. The fork contains the code needed for the RuTiE task, which has been submitted as a PR to lm-evaluation-harness but is not yet approved and merged. There are no plans yet to submit the MERA tasks directly into lm-evaluation-harness.
new_harness_codebase uses 0.4.x code, but the tasks are not fully in YAML format yet (they will be, but not yet — just like, for example, the SQuADv2 task in lm-evaluation-harness). The MERA tasks are stored in https://github.com/ai-forever/MERA/tree/update/new_harness_codebase/benchmark_tasks, as the new code allows loading tasks from an external directory.
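A hedged sketch of pointing the 0.4.x harness at that external tasks directory via its `--include_path` option; the model name, local path, and task name are illustrative assumptions:

```shell
# Sketch: run a task from the external benchmark_tasks directory.
# Assumes lm-evaluation-harness 0.4.x; names below are placeholders.
lm_eval --model hf \
    --model_args pretrained=my-org/my-model \
    --include_path ./benchmark_tasks \
    --tasks some_mera_task
```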
Hi!
Your benchmarks work well with version 0.3.0 of lm-evaluation-harness. Are there any plans to update them to support version 0.4.0?