Warning on `trust_remote_code` on datasets

aittalam commented 9 months ago

Running lm-harness with the following config:

# Model to evaluate
model:
  load_from: "tiiuae/falcon-7b"
  torch_dtype: "bfloat16"

# Settings specific to lm_harness.evaluate
evaluator:
  tasks: ["hellaswag", "mmlu"]
  num_fewshot: 5
  limit: 10

quantization:
  load_in_4bit: True
  bnb_4bit_quant_type: "fp4"
  bnb_4bit_compute_dtype: "bfloat16"

# Tracking info for where to log the run results
tracking:
  name: "tiiuae-falcon-7b-trust"
  project: "davide-testing-flamingo"
  entity: "mozilla-ai"

Returns the following warnings:

/tmp/ray/session_2024-02-01_09-08-37_813595_7/runtime_resources/pip/1989896a12212ac88dcdfee7f5a5fef9a4046e6c/virtualenv/lib/python3.10/site-packages/datasets/load.py:1429: FutureWarning: The repository for hellaswag contains custom code which must be executed to correctly load the dataset. You can inspect the repository content at https://hf.co/datasets/hellaswag
You can avoid this message in future by passing the argument `trust_remote_code=True`.
Passing `trust_remote_code=True` will be mandatory to load this dataset from the next major release of `datasets`.
...
/tmp/ray/session_2024-02-01_09-08-37_813595_7/runtime_resources/pip/1989896a12212ac88dcdfee7f5a5fef9a4046e6c/virtualenv/lib/python3.10/site-packages/datasets/load.py:1429: FutureWarning: The repository for hails/mmlu_no_train contains custom code which must be executed to correctly load the dataset. You can inspect the repository content at https://hf.co/datasets/hails/mmlu_no_train
You can avoid this message in future by passing the argument `trust_remote_code=True`.
Passing `trust_remote_code=True` will be mandatory to load this dataset from the next major release of `datasets`.

Example: http://10.145.91.219:8265/#/jobs/raysubmit_ap35cU8CVdhAtDaa

Adding `trust_remote_code` to the model section does not make the warnings go away: the argument has to be passed to `load_dataset`, as shown here. This is not our problem right now, but it might affect us at some point.
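For illustration, a minimal sketch of passing the flag at the dataset-loading call rather than in the model section. The helper names `dataset_load_kwargs` and `load_with_trust` are hypothetical and not part of lm-buddy or lm-harness; the sketch assumes a `datasets` release whose `load_dataset` still accepts `trust_remote_code`, as the warning above describes.

```python
from typing import Any


def dataset_load_kwargs(path: str, trust_remote_code: bool = False) -> dict[str, Any]:
    """Build keyword arguments for datasets.load_dataset (hypothetical helper)."""
    kwargs: dict[str, Any] = {"path": path}
    if trust_remote_code:
        # Opt in explicitly: this allows the dataset repository's own
        # loading script to run on the local machine.
        kwargs["trust_remote_code"] = True
    return kwargs


def load_with_trust(path: str, trust_remote_code: bool = False) -> Any:
    """Load a dataset, forwarding the trust flag to load_dataset."""
    # Imported lazily so the helper above works without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(**dataset_load_kwargs(path, trust_remote_code))
```

Calling `load_with_trust("hellaswag", trust_remote_code=True)` would then forward the flag to the place where the warning is raised.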

sfriedowitz commented 9 months ago

This is just an oversight; I didn't realize this parameter was necessary for datasets. It should be added as a field on the DatasetConfig and wired through to the appropriate location for dataset loading.
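A rough sketch of what that wiring could look like. The `DatasetConfig` below is a hypothetical stand-in written as a plain dataclass, not the project's actual class; the field names `load_from` and `split` and the function `load_from_config` are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class DatasetConfig:
    """Hypothetical stand-in for the dataset config; field names are assumptions."""

    load_from: str
    split: Optional[str] = None
    # Off by default: trusting remote code runs the dataset repository's script.
    trust_remote_code: bool = False


def load_from_config(
    config: DatasetConfig, loader: Optional[Callable[..., Any]] = None
) -> Any:
    """Pass the config's fields, including the trust flag, to the loader."""
    if loader is None:
        # Lazy import so the config class itself does not require `datasets`.
        from datasets import load_dataset as loader

    return loader(
        config.load_from,
        split=config.split,
        trust_remote_code=config.trust_remote_code,
    )
```

Accepting the loader as a parameter keeps the wiring testable without network access; in normal use it falls back to `datasets.load_dataset`.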

aittalam commented 9 months ago

No worries! I think it was less an oversight on your part than something hidden from us by the lm-harness implementation. We call `simple_evaluate` out of the box, passing only the tasks (so no pointer to the actual datasets yet), and the function does not let us specify whether we want to trust remote code. This is why I wrote "this is not our problem", but it is still worth keeping an eye on (perhaps we can send a pull request, or see whether they have already started thinking about a solution?).

sfriedowitz commented 9 months ago

Ahh, I see. If this is occurring from within lm-harness, then I don't think we have control over it for now.

We should monitor their repo to see whether they release a new version or add a flag to `simple_evaluate` for trusting remote code.

sfriedowitz commented 9 months ago

Or yes, as you said, we can contribute to and fix the issue from within lm-harness itself!

veekaybee commented 8 months ago

This is something I can look into from the lm-harness side

veekaybee commented 8 months ago

Started working on this here: https://github.com/EleutherAI/lm-evaluation-harness/issues/1135

veekaybee commented 8 months ago

See PR and discussion here: https://github.com/EleutherAI/lm-evaluation-harness/pull/1487

veekaybee commented 8 months ago

This is now addressed for hellaswag and other datasets mentioned in https://github.com/EleutherAI/lm-evaluation-harness/pull/1487 by the lm-harness 0.4.2 release, confirmed with a Ray run.

Note that the other dataset (hails/mmlu_no_train) still triggers the warning because we haven't set the flag for it in lm-harness.

2024-03-18:07:29:17,770 INFO     [evaluator.py:131] Setting random seed to 0 | Setting numpy seed to 1234 | Setting torch manual seed to 1234
/tmp/ray/session_2024-03-18_06-51-03_969458_8/runtime_resources/pip/2540f4210879a6062c59feb45343ed83b766f2d8/virtualenv/lib/python3.10/site-packages/datasets/load.py:1461: FutureWarning: The repository for hails/mmlu_no_train contains custom code which must be executed to correctly load the dataset. You can inspect the repository content at https://hf.co/datasets/hails/mmlu_no_train
You can avoid this message in future by passing the argument `trust_remote_code=True`.

PR to update lm-buddy to fix this: https://github.com/mozilla-ai/lm-buddy/pull/85

mozilla-ai / lm-buddy

Warning on `trust_remote_code` on datasets #49