huggingface / lighteval

Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends
MIT License

[FT] pass trust_remote_code as flag for loading datasets with custom code #314

Open chuandudx opened 2 months ago

chuandudx commented 2 months ago

Issue encountered

When trying to run an evaluation while loading a dataset with custom code, there doesn't seem to be an easy way to set the option `trust_remote_code=True`.

The command line prints this prompt:

The repository for swiss_leading_decision_summarization_eval contains custom code which must be executed to correctly load the dataset. You can inspect the repository content at https://hf.co/datasets/swiss_leading_decision_summarization_eval.
You can avoid this prompt in future by passing the argument `trust_remote_code=True`.

Do you wish to run the custom code? [y/N] 

If the user doesn't type `y` quickly enough (within roughly 10 seconds), there is an error:

  File "/Users/chuandu/Documents/workspace/legal_llm_evaluation/llm_eval_env/lib/python3.10/site-packages/datasets/load.py", line 131, in resolve_trust_remote_code
    raise ValueError(
ValueError: The repository for swiss_leading_decision_summarization_eval contains custom code which must be executed to correctly load the dataset. You can inspect the repository content at https://hf.co/datasets/swiss_leading_decision_summarization_eval.
Please pass the argument `trust_remote_code=True` to allow custom code to be run.
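As a stopgap until a CLI flag exists, a minimal sketch of a workaround: recent versions of the `datasets` library read an environment variable that pre-answers the trust prompt, so the evaluation never blocks on interactive input. (Assumption: whether this variable is honored depends on the installed `datasets` version; check your version's documentation.)

```shell
# Pre-approve custom dataset loading code for this shell session,
# so datasets.load_dataset never shows the [y/N] prompt.
export HF_DATASETS_TRUST_REMOTE_CODE=1
# then run the lighteval command as before
```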

Solution/Feature

Currently we can pass `trust_remote_code=True` for the model via `--model_args "pretrained=Qwen/Qwen2-0.5B-Instruct,trust_remote_code=True"`, but a separate variable is needed for loading datasets with custom code.

A similar way to pass this to the lighteval CLI would be great.
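To illustrate the requested wiring, here is a minimal sketch of forwarding such a flag into the keyword arguments that would be passed to `datasets.load_dataset`. The helper name and the flag plumbing are assumptions for illustration, not lighteval's actual API; only `trust_remote_code` itself is a real `load_dataset` parameter.

```python
# Hypothetical helper (not part of lighteval): collect kwargs for
# datasets.load_dataset, opting in to custom loading code when asked.
def build_load_dataset_kwargs(dataset_name: str, trust_remote_code: bool = False) -> dict:
    kwargs = {"path": dataset_name}
    if trust_remote_code:
        # Executing the dataset repo's loading script is pre-approved,
        # which suppresses the interactive [y/N] prompt above.
        kwargs["trust_remote_code"] = True
    return kwargs

# Usage (network call, so not run here):
# import datasets
# ds = datasets.load_dataset(**build_load_dataset_kwargs("my/dataset", trust_remote_code=True))
```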

clefourrier commented 1 month ago

The parameter you are looking for (in custom tasks, where you define your own `LightevalTaskConfig`) is here. Sorry I missed the issue - does that answer it for you?