EvolvingLMMs-Lab / lmms-eval

Accelerating the development of large multimodal models (LMMs) with lmms-eval
https://lmms-lab.github.io/

[Feature Request] Update Datasets Version, so that lmms-eval can be used in Offline Environment #335

Open jungle-gym-ac opened 2 days ago

jungle-gym-ac commented 2 days ago

I encountered an issue similar to this one when running lmms-eval on an offline machine (no Internet access): the load_dataset method still tries to reach the Hugging Face Hub even when HF_DATASETS_OFFLINE is set to 1.

I looked into this issue in the Hugging Face Datasets repository and found that it is a bug in the datasets library: load_dataset still tries to reach the Hugging Face Hub after setting HF_DATASETS_OFFLINE to 1.

The bug is fixed by this PR as of Datasets version 2.19.0. It has also been verified here that updating Datasets to a newer version actually allows lmms-eval to run without this bug in an offline environment.

So I suggest updating the Datasets version to >= 2.19.0 so that lmms-eval can be used in a fully offline environment. Are there any plans for that?
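To make the version requirement concrete, here is a minimal sketch of a pre-flight check that could run before an offline evaluation. It is not part of lmms-eval: the helper names (`parse_version`, `supports_offline`, `installed_datasets_version`) are hypothetical, the 2.19.0 threshold is the one reported in this issue, and the version parsing is deliberately simplified (pre-release suffixes such as `rc1` are ignored). It also assumes that HF_DATASETS_OFFLINE has to be set before datasets is imported, since the library reads it at import time.

```python
import os
import re
from importlib import metadata

# First Datasets release reported in this issue to contain the offline fix.
MIN_DATASETS_VERSION = (2, 19, 0)


def parse_version(text):
    """Return the leading numeric components of a version string as a tuple."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component, e.g. "dev0"
        parts.append(int(match.group()))
    return tuple(parts)


def supports_offline(version_text, minimum=MIN_DATASETS_VERSION):
    """True if the given version is at least the minimum (simplified comparison)."""
    parsed = parse_version(version_text)
    # Pad short versions so that "2.19" is treated like "2.19.0".
    padded = parsed + (0,) * (len(minimum) - len(parsed))
    return padded >= minimum


def installed_datasets_version():
    """Return the installed datasets version string, or None if it is not installed."""
    try:
        return metadata.version("datasets")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Assumption: set this before any `import datasets` in the same process.
    os.environ["HF_DATASETS_OFFLINE"] = "1"
    version = installed_datasets_version()
    if version is None:
        print("datasets is not installed")
    elif supports_offline(version):
        print(f"datasets {version}: offline mode should be respected")
    else:
        print(f"datasets {version}: older than 2.19.0, load_dataset may still contact the Hub")
```

Running a check like this once on the offline machine would show whether the installed Datasets version predates the fix before a long multi-task evaluation is started.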

(Although there are currently some workarounds for running lmms-eval in an offline environment (179, 21), I find them inconvenient when you need to evaluate many tasks. I think supporting offline environments in lmms-eval would help a lot of users.)

Luodian commented 1 day ago

Thanks for this nice suggestion!

Do you often use lmms-eval in an offline environment? We would much appreciate it if you could send a PR that updates the version and also gives us some guidance by adding to ./docs/xxx.md a description of how to use lmms-eval in an offline environment.