Issue encountered
trust_remote_code is now mandatory when loading a dataset that relies on a loading script.
Solution/Feature
Add trust_remote_code as an argument to create_custom_tasks_module(), then to get_custom_tasks(), and to Pipeline.pipeline_parameters (lighteval.tasks.registry).
class Pipeline:
    def _init_tasks_and_requests(self, tasks):
        with htrack_block("Tasks loading"):
            with local_ranks_zero_first() if self.launcher_type == ParallelismManager.NANOTRON else nullcontext():
                # If some tasks are provided as task groups, we load them separately
                custom_tasks = self.pipeline_parameters.custom_tasks_directory
                tasks_groups_dict = None
                if custom_tasks:
                    _, tasks_groups_dict = get_custom_tasks(custom_tasks, trust_remote_code=self.pipeline_parameters.trust_remote_code)
                if tasks_groups_dict and tasks in tasks_groups_dict:
                    tasks = tasks_groups_dict[tasks]
                ...
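The excerpt above only covers the call site. Here is a minimal sketch of the other two pieces, using simplified stand-ins rather than the actual lighteval code: the PipelineParameters fields, the function bodies, the False default, and the "my_tasks.py" path are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PipelineParameters:
    # Hypothetical, trimmed-down stand-in for the real parameters class.
    custom_tasks_directory: Optional[str] = None
    trust_remote_code: bool = False  # assumed default: the user has to opt in


def create_custom_tasks_module(custom_tasks: str, trust_remote_code: bool = False) -> dict:
    # Stand-in body: it only records the flag it was called with.
    return {"source": custom_tasks, "trust_remote_code": trust_remote_code}


def get_custom_tasks(custom_tasks: str, trust_remote_code: bool = False):
    # Forward the flag explicitly to the module loader.
    module = create_custom_tasks_module(custom_tasks, trust_remote_code=trust_remote_code)
    tasks_groups_dict = None  # this stand-in defines no task groups
    return module, tasks_groups_dict


params = PipelineParameters(custom_tasks_directory="my_tasks.py", trust_remote_code=True)
module, tasks_groups_dict = get_custom_tasks(
    params.custom_tasks_directory, trust_remote_code=params.trust_remote_code
)
```

Keeping the flag as an explicit keyword argument makes the opt-in visible at every level of the call chain.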
Maybe a more general fix could be to pass a generic **kwargs as an additional input to get_custom_tasks().
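A rough sketch of what that more general variant might look like, again with hypothetical stand-ins and not the real lighteval functions (the load_kwargs name and the function bodies are assumptions):

```python
def create_custom_tasks_module(custom_tasks, **load_kwargs):
    # Stand-in body: it only records the keyword arguments it received.
    return {"source": custom_tasks, "load_kwargs": load_kwargs}


def get_custom_tasks(custom_tasks, **load_kwargs):
    # Forward every extra keyword argument untouched, so a new loading option
    # does not require another signature change here.
    module = create_custom_tasks_module(custom_tasks, **load_kwargs)
    tasks_groups_dict = None  # this stand-in defines no task groups
    return module, tasks_groups_dict


module, _ = get_custom_tasks("my_tasks.py", trust_remote_code=True)
```

The trade-off is that a misspelled option is forwarded silently instead of raising a TypeError at the call site, so the explicit parameter may still be preferable for a security-sensitive flag.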