Closed: veekaybee closed this 8 months ago
What does `lm_eval.tasks.ALL_TASKS` return now? Is it the full list of tasks without the need to do any initialization? If so, this is great, as it will solve the issue we had when multiple inits occurred!
> What does `lm_eval.tasks.ALL_TASKS` return now?
This was changed to `task_manager.all_tasks`; see https://github.com/EleutherAI/lm-evaluation-harness/pull/1321/files#diff-45b7cf15ed225746696faea6973a591c9526c8a78f486e28b11f63635e543666R13. I think this initializes all tasks, but we can do some testing to be sure.
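For reference, a quick way to check this; a minimal sketch against lm-eval 0.4.x, where the task count will depend on the installed version:

```python
from lm_eval.tasks import TaskManager

# Constructing a TaskManager indexes the task registry up front.
tm = TaskManager()

# all_tasks is a property returning the sorted list of registered task names.
print(len(tm.all_tasks))
print(tm.all_tasks[:5])
```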
I looked into the code; it seems that a new `TaskManager` (which performs task initialization at init time) is created if none is specified, so that should fix the problem that required us to manually run the initialization.

@sfriedowitz can you test the new code in your new environment once it is merged, so we can see if that solves the concurrency issue? I think it should, but if not we can move the lazy `TaskManager` creation up into our code and just pass it to `simple_evaluate` as a parameter; a sketch of that fallback follows.
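A sketch of that fallback, assuming lm-eval 0.4.x; the model and task arguments are placeholders, not values from this PR:

```python
import lm_eval
from lm_eval.tasks import TaskManager

# Build the TaskManager once in our own code (e.g. lazily, behind a lock)
# and hand it to simple_evaluate, instead of letting each concurrent call
# construct its own.
task_manager = TaskManager()

results = lm_eval.simple_evaluate(
    model="hf",                                     # placeholder model type
    model_args="pretrained=EleutherAI/pythia-70m",  # placeholder checkpoint
    tasks=["hellaswag"],                            # placeholder task
    task_manager=task_manager,
)
```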
What's changing
LM-eval has now been bumped to 0.4.2, incorporating `trust_remote_code=True` as a default for a select set of curated datasets, which removes the warning for those datasets specifically. We also bump `datasets` as a result.

With this change, we also remove the need to call `lm_eval.tasks.initialize_tasks()`.
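Concretely, the call-site change looks roughly like this; a sketch, with placeholder model and task arguments rather than the defaults used here:

```python
import lm_eval

# Before this change, explicit task registration was required up front:
# lm_eval.tasks.initialize_tasks()

# With lm-eval 0.4.2, simple_evaluate creates a TaskManager internally
# when none is supplied, so the call above is no longer needed.
results = lm_eval.simple_evaluate(
    model="hf",                                     # placeholder
    model_args="pretrained=EleutherAI/pythia-70m",  # placeholder
    tasks=["hellaswag"],                            # placeholder
)
```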
How to test it
Run a test Ray job using the evaluate entrypoint with some default params:
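A hedged sketch of such a test using Ray's job submission API; the cluster address, working directory, and entrypoint command are assumptions to be replaced with this repo's actual values:

```python
from ray.job_submission import JobSubmissionClient

# Cluster address is a hypothetical placeholder.
client = JobSubmissionClient("http://127.0.0.1:8265")

job_id = client.submit_job(
    # Replace with this repo's real evaluate entrypoint and default params.
    entrypoint="python -m <evaluate_entrypoint>",
    runtime_env={"working_dir": "."},
)
print(client.get_job_status(job_id))
```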
Related Jira Ticket
Additional notes for reviewers