mlabonne / llm-autoeval

Automatically evaluate your LLMs in Google Colab
MIT License

Runpod.sh Refactor and new implementations. #14

Open Steel-skull opened 5 months ago

Steel-skull commented 5 months ago

TL;DR: Refactored the code in Runpod.sh and fixed any issues this caused in downstream files.

Updated the evaluation harness repo from https://github.com/dmahan93/lm-evaluation-harness to https://github.com/EleutherAI/lm-evaluation-harness and made the necessary changes to Table.py to fix formatting issues.

Runpod.sh:

GPU Detection and Parallelization Logic: Added logic to set a flag for parallelization if multiple GPUs are detected.
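A minimal sketch of what this detection could look like in bash (the `PARALLELIZE` flag name and the use of `nvidia-smi -L` are assumptions, not the PR's actual code):

```shell
#!/bin/bash
# nvidia-smi -L prints one line per visible GPU; count them.
# With no driver installed, the count falls back to 0.
gpu_count=$(nvidia-smi -L 2>/dev/null | wc -l)

# set_parallel_flag: print "True" when more than one GPU is available.
set_parallel_flag() {
  if [ "$1" -gt 1 ]; then
    echo "True"
  else
    echo "False"
  fi
}

PARALLELIZE=$(set_parallel_flag "$gpu_count")
echo "Detected $gpu_count GPU(s); parallelize=$PARALLELIZE"
```

The flag can then be forwarded to the harness (e.g. via `parallelize=True` in `--model_args`) when multiple GPUs are present.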

Test Quantization: Set up the ability to quantize models for testing.
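A sketch of how the quantization toggle could be wired into the harness invocation (the `build_model_args` helper and the `example-org/example-model` placeholder are assumptions; passing `load_in_4bit` through `--model_args` to the Hugging Face loader is the usual lm-evaluation-harness pattern):

```shell
#!/bin/bash
# build_model_args: compose the --model_args string for the harness,
# appending load_in_4bit only when 4-bit quantization is requested.
build_model_args() {
  local args="pretrained=$1"
  if [ "$2" = "True" ]; then
    args="${args},load_in_4bit=True"
  fi
  echo "$args"
}

# MODEL is a placeholder here; llm-autoeval receives it from the pod env.
MODEL="${MODEL:-example-org/example-model}"
LOAD_IN_4BIT="${LOAD_IN_4BIT:-False}"
model_args=$(build_model_args "$MODEL" "$LOAD_IN_4BIT")
echo "$model_args"
```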

Package Installation: Added installation of the deepspeed, gekko, and AUTO_GPTQ libraries.

Environment Variable Handling: Introduced LOAD_IN_4BIT and AUTOGPTQ environment variables with default fallbacks.
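In bash the standard idiom for default fallbacks is `${VAR:-default}`; a sketch (the `False` defaults here are assumptions, not necessarily the PR's values):

```shell
#!/bin/bash
# Fall back to "False" when the pod environment does not provide
# these variables, so later checks never see an empty string.
LOAD_IN_4BIT="${LOAD_IN_4BIT:-False}"
AUTOGPTQ="${AUTOGPTQ:-False}"
echo "LOAD_IN_4BIT=$LOAD_IN_4BIT AUTOGPTQ=$AUTOGPTQ"
```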

Script Structure: Introduced functions run_benchmark_nous and run_benchmark_openllm to encapsulate benchmark execution and make it easier to add future benchmarks.

Refactoring the Benchmark Execution: Replaced benchmark execution blocks with calls to the newly defined functions.
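The function names come from the PR; the bodies and the `BENCHMARK` dispatch below are placeholder sketches (the real functions invoke lm-evaluation-harness with each suite's task list):

```shell
#!/bin/bash
# Placeholder bodies: the Nous task list matches the suites mentioned
# later in this thread; the Open LLM list follows the leaderboard tasks.
run_benchmark_nous() {
  echo "running Nous suite: agieval, gpt4all, truthfulqa, bigbench"
}

run_benchmark_openllm() {
  echo "running Open LLM suite: arc, hellaswag, mmlu, truthfulqa, winogrande, gsm8k"
}

# Dispatch on the BENCHMARK environment variable (default assumed: nous).
case "${BENCHMARK:-nous}" in
  nous)    run_benchmark_nous ;;
  openllm) run_benchmark_openllm ;;
  *)       echo "Unknown benchmark: $BENCHMARK" >&2 ;;
esac
```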

Improved Error Handling and Logging: Added a check for the presence of main.py before attempting to run it, and fixed the paths for main.py and other files to resolve "Cannot find file" errors.
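A sketch of such a guard (the `require_file` helper and the `lm-evaluation-harness` checkout path are assumptions):

```shell
#!/bin/bash
# require_file: succeed only when the given file exists; log otherwise.
require_file() {
  if [ -f "$1" ]; then
    return 0
  fi
  echo "Error: cannot find $1" >&2
  return 1
}

# Hypothetical location of the cloned harness relative to the script.
HARNESS_DIR="lm-evaluation-harness"
if require_file "$HARNESS_DIR/main.py"; then
  python "$HARNESS_DIR/main.py" --tasks "$BENCHMARK"
fi
```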

Infinite Sleep for Debug Mode: Added a message to indicate that the script is in debug mode, providing clarity during script execution.

Steel-skull commented 5 months ago

Changes will need to be made to the Colab notebook to add the 4-bit option.

Steel-skull commented 5 months ago

Adding 4-bit has increased processing times. I will be looking into how to fix this; I think I need to push a variable to GPTQ.

Warning in RunPod:
UserWarning: Input type into Linear4bit is torch.float16, but bnb_4bit_compute_dtype=torch.float32 (default). This will lead to slow inference or training speed.

Need to push the variable bnb_4bit_compute_dtype="auto" (first attempt did not work).

May need to add a function to detect the dtype and then push it to GPTQ.
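One shell-level way to sketch that detection (the `cap_to_dtype` helper is hypothetical; `nvidia-smi --query-gpu=compute_cap` requires a reasonably recent driver): bfloat16 needs Ampere (compute capability 8.0) or newer, so map the capability to a dtype and pass the result through as `bnb_4bit_compute_dtype` instead of letting bitsandbytes default to float32.

```shell
#!/bin/bash
# cap_to_dtype: map a CUDA compute capability to a bitsandbytes compute
# dtype. Ampere (8.0+) supports bfloat16; older cards use float16.
cap_to_dtype() {
  awk -v cap="$1" 'BEGIN { print (cap >= 8.0 ? "bfloat16" : "float16") }'
}

# Query the first GPU's compute capability; default to 0 if unavailable.
COMPUTE_CAP=$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader 2>/dev/null | head -n1)
BNB_COMPUTE_DTYPE=$(cap_to_dtype "${COMPUTE_CAP:-0}")
echo "bnb_4bit_compute_dtype=$BNB_COMPUTE_DTYPE"
```

The resulting value would still need to be forwarded into the Python side of the harness, e.g. as part of the model loading arguments.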

Future ideas...

Steel-skull commented 5 months ago

Running without issue:
- AGIEval
- GPT4All

Failing to run (need to test further):
- TruthfulQA
- Bigbench

Steel-skull commented 5 months ago

Converting to draft until I am able to solve the discovered issues.