huggingface / lighteval

LightEval is a lightweight LLM evaluation suite that Hugging Face has been using internally with the recently released LLM data processing library datatrove and LLM training library nanotron.

MIT License


CLI list tasks #142

Closed DimbyTa closed 2 months ago

DimbyTa commented 3 months ago

Creation of a CLI command to list tasks #51

Created a commands directory under src/lighteval for future commands.
Created the Python script lighteval_cli.py.
Modified mt_bench's main.py: renamed mt_bench_metric = SampleLevelMetricGrouping to mt_bench_metrics = SampleLevelMetricGrouping to avoid a naming conflict with the mt_bench_metric function, and updated extend_enum accordingly.
Modified setup.py: added entry_points for console_scripts.
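For readers unfamiliar with console_scripts, here is a minimal sketch of what such an entry looks like and how its parts are read. The module path lighteval.commands.lighteval_cli is inferred from the files listed above, and the function name main is an assumption; neither is confirmed by the PR diff. The helper parse_console_script is hypothetical and exists only for illustration.

```python
# Sketch of the `entry_points` value passed to setuptools' setup().
# ASSUMPTION: the module path and the `main` function name are illustrative,
# inferred from the PR description, not taken from the actual diff.
ENTRY_POINTS = {
    "console_scripts": [
        "lighteval = lighteval.commands.lighteval_cli:main",
    ]
}


def parse_console_script(spec):
    """Split 'name = package.module:function' into (name, module, function).

    Hypothetical helper: it only illustrates how the spec string is structured.
    """
    name, sep, target = spec.partition("=")
    if not sep:
        raise ValueError(f"missing '=' in entry point spec: {spec!r}")
    module, sep, attr = target.strip().partition(":")
    if not sep:
        raise ValueError(f"missing ':' in entry point target: {target!r}")
    return name.strip(), module.strip(), attr.strip()


command, module, function = parse_console_script(ENTRY_POINTS["console_scripts"][0])
print(f"running `{command}` would import {module} and call {function}()")
```

The part before the equals sign becomes the executable name installed on the PATH; the part after it names the module to import and the callable to invoke. The same module:function target format is used by the [project.scripts] table in pyproject.toml.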

[Screenshots: setup_modif, extend_enum_mt_bench_after, extend_enum_mt_bench_before, mt_bench_metric_after, mt_bench_metric_before]

clefourrier commented 2 months ago

Thanks for the PR, I'll give it a spin next week and tell you if it worked well :)

DimbyTa commented 2 months ago

@clefourrier, thank you for reviewing the PR, I appreciate it :)

I have already assigned the license to The HuggingFace Team and reverted the naming in tasks/extended/mt_bench/main.py. That leaves the question of whether to keep using setup.py or to move the entry points into pyproject.toml. I can adapt to your needs.

Thanks again :)

clefourrier commented 2 months ago

Agree for maintainability. However, this raises a question: do we want to convert our already existing script to an actual CLI (to be able to do `lighteval list_evals` or `lighteval --accelerate --...`)? (Would expand the scope of the PR)

NathanHB commented 2 months ago

> Agree for maintainability. However, this raises a question: do we want to convert our already existing script to an actual CLI (to be able to do `lighteval list_evals` or `lighteval --accelerate --...`)? (Would expand the scope of the PR)

Is it possible to do something like accelerate lighteval ....?

clefourrier commented 2 months ago

@muellerzr, can we use accelerate to launch a library?

muellerzr commented 2 months ago

Yes, you can: accelerate launch -m mymodule. That said (and we can discuss this offline), I can show you how to bypass that entirely and call it internally in your code, so that it just looks like lighteval dothing.

DimbyTa commented 2 months ago

@muellerzr, @clefourrier, @NathanHB, thank you all for reviewing the PR and commenting on it!

DimbyTa commented 2 months ago

@clefourrier, @NathanHB, could you share the new requirements, so that I know where we are heading? Thank you! :)

clefourrier commented 2 months ago

Of course, give us a day to sync on this internally, and we'll come back to you on Wednesday :)

DimbyTa commented 2 months ago

Thank you, I will be waiting. :)

clefourrier commented 2 months ago

Hi! We discussed it internally, and it would be great if you could indeed modify pyproject.toml to allow lighteval --list-tasks. For the rest (launching the other lighteval options through a CLI), we'll handle it in another PR once yours is merged :)
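As a rough illustration of the requested behaviour, the sketch below wires a --list-tasks flag with argparse. It is not the code merged in this PR: EXAMPLE_TASKS is a hypothetical stand-in for lighteval's real task registry, and the function names build_parser, format_tasks and main are assumptions made for the example.

```python
import argparse

# HYPOTHETICAL registry: lighteval's real task table is not shown in this
# thread, so a small placeholder mapping of suite -> task names is used.
EXAMPLE_TASKS = {
    "example_suite": ["task_b", "task_a"],
    "another_suite": ["task_c"],
}


def build_parser():
    """Create a parser exposing only the --list-tasks flag."""
    parser = argparse.ArgumentParser(prog="lighteval")
    parser.add_argument(
        "--list-tasks",
        action="store_true",
        help="Print the available tasks and exit.",
    )
    return parser


def format_tasks(tasks):
    """Render the tasks grouped by suite, both sorted alphabetically."""
    lines = []
    for suite in sorted(tasks):
        lines.append(f"{suite}:")
        lines.extend(f"  - {name}" for name in sorted(tasks[suite]))
    return "\n".join(lines)


def main(argv=None):
    """Entry point; returns a process exit code."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.list_tasks:
        print(format_tasks(EXAMPLE_TASKS))
        return 0
    # No flag given: show usage and signal that nothing was done.
    parser.print_help()
    return 1
```

A console entry point declared in pyproject.toml would point at a function like main, so that calling main(["--list-tasks"]) corresponds to running lighteval --list-tasks from a shell.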

DimbyTa commented 2 months ago

Hello, I understand. I will revert setup.py to its original state and move the console entry point to pyproject.toml.

DimbyTa commented 2 months ago

@clefourrier, @NathanHB, thank you for reviewing the PR and for the feedback. The task is done now :)

DimbyTa commented 2 months ago

Sorry for all those commits. My ISP raised its prices and cut the available data allowance right after I committed to working on issue #51, so I couldn't install lighteval on my local computer: I didn't have enough data to download all the required dependencies. I'm using Google Colab to test my modifications, so I'm pushing before testing rather than testing before pushing :sweat_smile:. It's not ideal, but I really want to work on this, so that's the solution I found.

Anyway, the quality check keeps failing, but when I look at what is wrong with the formatting, no error is shown... [Screenshot: issue_formatting]

DimbyTa commented 2 months ago

The command works fine; it only emits a warning because no OpenAI API key is set.

[Screenshot: lighteval_cli]

clefourrier commented 2 months ago

No problem with the multiple commits: we squash PRs before merging them, so it's no trouble really. We'll test it on our side, then merge if it's good. For style, we usually use the pre-commit hook; you'll find the commands to run there (ruff --fix and ruff format).

DimbyTa commented 2 months ago

@clefourrier Thank you, I will check it out.

NathanHB commented 2 months ago

Hi! For the quality check, you can simply run make style after installing the correct ruff version in your Python environment (with pip install -e .[dev]).

DimbyTa commented 2 months ago

Thanks for the advice, @NathanHB.

DimbyTa commented 2 months ago

@clefourrier, @NathanHB, @muellerzr, thank you all for reviewing this PR!

Thank you for your advice and feedback as well. I appreciate it.