
Starting Kit for Edge-Device LLM Competition, NeurIPS 2024

This is the starting kit for the Edge-Device LLM Competition, a NeurIPS 2024 competition. To learn more about the competition, please see the competition website. This starting kit provides instructions on downloading data, running evaluations, and generating submissions.

Please join us on Discord for discussions and up-to-date announcements:

https://discord.gg/yD89SPNr3b

Evaluation for CommonsenseQA, BIG-Bench Hard, GSM8K, LongBench, HumanEval, CHID, TruthfulQA Tasks

Open Evaluation Task

The evaluation of CommonsenseQA, BIG-Bench Hard, GSM8K, LongBench, HumanEval, CHID, and TruthfulQA is conducted with the OpenCompass tool.

Environment setup

  conda create --name opencompass python=3.10 
  conda activate opencompass
  conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
  pip install faiss-gpu
  cd opencompass && pip install -e .
  cd human-eval && pip install -e . && cd ..
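A quick import check can confirm the environment before running any evaluation. This is a minimal sketch; it assumes the `torch`, `faiss`, and `opencompass` module names that the packages above install, so adjust if your versions differ:

# Minimal environment sanity check (assumed module names: torch, faiss, opencompass).
import torch
import faiss          # installed via faiss-gpu
import opencompass    # installed via `pip install -e .`

print("CUDA available:", torch.cuda.is_available())
print("OpenCompass version:", getattr(opencompass, "__version__", "unknown"))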

Pretrained Model Preparation for Track-1

Data Preparation

# Download dataset to data/ folder
wget https://github.com/open-compass/opencompass/releases/download/0.2.2.rc1/OpenCompassData-core-20240207.zip
unzip OpenCompassData-core-20240207.zip
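The archive is expected to unpack into a top-level data/ directory next to run.py. A minimal sketch to verify the layout (the directory name follows the OpenCompass convention assumed above):

# Hedged sanity check: the archive should unpack into ./data
from pathlib import Path

data_dir = Path("data")
assert data_dir.is_dir(), "Expected OpenCompassData-core to unpack into ./data"
print(sorted(p.name for p in data_dir.iterdir())[:10])  # peek at the first few datasets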

Evaluate Hugging Face models

CUDA_VISIBLE_DEVICES=0 python run.py --datasets commonsenseqa_gen longbench bbh_gen gsm8k_gen humaneval_gen FewCLUE_chid_gen truthfulqa_gen --hf-num-gpus 1 --hf-type base --hf-path microsoft/phi-2 --debug --model-kwargs device_map='auto' trust_remote_code=True
# --datasets: specify the datasets to evaluate

Evaluate local models

CUDA_VISIBLE_DEVICES=0 python run.py --datasets commonsenseqa_gen longbench bbh_gen gsm8k_gen humaneval_gen FewCLUE_chid_gen truthfulqa_gen --hf-num-gpus 1 --hf-type base --models example --debug --model-kwargs device_map='auto' trust_remote_code=True
# --models: specify the local model

[!TIP]

- The wrapped model file (.py) needs to be placed under the folder opencompass/opencompass/models (see the sketch below).

- The prepared configuration file needs to be placed under the folder /opencompass/configs.
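For orientation, here is a hedged sketch of what such a wrapped model file might look like. The BaseModel interface shown matches OpenCompass 0.2.x as far as we know; check opencompass/models/base.py in your checkout for the exact signatures. MyLocalModel and its file name are placeholders:

# opencompass/opencompass/models/my_model.py -- sketch of a wrapped model.
from typing import List

from opencompass.models.base import BaseModel

class MyLocalModel(BaseModel):
    """Placeholder wrapper; load your checkpoint and implement the two hooks."""

    def __init__(self, path: str, max_seq_len: int = 2048, **kwargs):
        super().__init__(path=path, max_seq_len=max_seq_len, **kwargs)
        # Load your model/tokenizer here (e.g. with transformers).

    def get_token_len(self, prompt: str) -> int:
        # Return the tokenized length of `prompt` using your tokenizer.
        raise NotImplementedError

    def generate(self, inputs: List[str], max_out_len: int) -> List[str]:
        # Return one completion string per input prompt.
        raise NotImplementedError

A matching configuration file (again a sketch; it assumes the class is re-exported from opencompass.models, e.g. via opencompass/models/__init__.py) lets the model be selected with --models example:

# /opencompass/configs/example.py -- sketch of the matching config.
from opencompass.models import MyLocalModel  # assumes the class is re-exported there

models = [
    dict(
        type=MyLocalModel,
        abbr='my-local-model',
        path='/path/to/checkpoint',  # placeholder path
        max_seq_len=2048,
        max_out_len=100,
        batch_size=8,
        run_cfg=dict(num_gpus=1),
    ),
]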

GPU Memory Usage and Throughput Measurement

# Replace the model/tokenizer loader code with your code. DO NOT CHANGE THE HYPER-PARAMETER SETTING.
python EvaluateThroughputAndMemory.py --model_name MODEL_NAME

[!Note]

- batch_size needs to be set to 1 and max_length needs to be set to 2K.
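Since the script's internals are not reproduced here, the following is only a hedged illustration of the kind of loader you would swap in; load_model_and_tokenizer is a hypothetical name, and everything outside the loading code (including batch_size=1 and max_length=2K) must stay as shipped:

# Hypothetical drop-in loader for EvaluateThroughputAndMemory.py.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def load_model_and_tokenizer(model_name: str):
    # Only the loading logic is yours to change; the measurement
    # hyper-parameters (batch_size=1, max_length=2K) must stay fixed.
    tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        torch_dtype=torch.float16,
        device_map="auto",
        trust_remote_code=True,
    )
    model.eval()
    return model, tokenizer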

Compile Model via MLC-MiniCPM

Step-by-step instructions are presented in the following sections:

Prepare Environment

Follow https://llm.mlc.ai/docs/deploy/android.html to prepare requirements.

To compile PyTorch models from Hugging Face, run the following commands to install mlc_chat.

mkdir -p build && cd build
# generate build configuration
python3 ../cmake/gen_cmake_config.py && cd ..
# build `mlc_chat_cli`
cd build && cmake .. && cmake --build . --parallel $(nproc) && cd ..
# install
cd python && pip install -e . && cd ..

Compile Model

Refer to https://github.com/OpenBMB/mlc-MiniCPM.

Put the downloaded Hugging Face model checkpoint into dist/models.

MODEL_NAME=MiniCPM
MODEL_TYPE=minicpm
mlc_chat convert_weight --model-type ${MODEL_TYPE} ./dist/models/${MODEL_NAME}-hf/  -o dist/$MODEL_NAME/
mlc_chat gen_config --model-type ${MODEL_TYPE} ./dist/models/${MODEL_NAME}-hf/ --conv-template LM --sliding-window-size 768 -o dist/${MODEL_NAME}/
mlc_chat compile --model-type ${MODEL_TYPE} dist/${MODEL_NAME}/mlc-chat-config.json --device android -o ./dist/libs/${MODEL_NAME}-android.tar
cd ./android/library
./prepare_libs.sh
cd -

Submission Requirements

Please upload all the required materials to a GitHub repository and submit the repository link to us via the submission form.

An example of the submission format can be found in the Submission_Example folder.