amosproj / amos2024ss08-cloud-native-llm


Conduct Large Language Base Model Selection Process #19

Closed grayJiaaoLi closed 1 month ago

grayJiaaoLi commented 1 month ago

User story

  1. As a Software Developer
  2. I need to select a Large Language base model that matches the project's requirements, following an LLM selection process
  3. So that the LLM we create performs as well as possible.

Acceptance criteria

DoD general criteria

anosh-ar commented 1 month ago

Comparison of candidate models

Here is a comparison of some prominent base LLM models (already fine-tuned models are marked with *). The following points were considered when selecting the models:

Considering the above, the proposed models to start with would be:

  1. Llama3_8b
  2. Gemma_7b
  3. Llama3_70b
| Model | Model size | HuggingFace Avg | ARC | HellaSwag | MMLU | HumanEval | AGIEval (chat) | License |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Gemma | 7b | 64.3 | 61 | 82.5 | 66 | 32.3 | 41.7 | Gemma (allows redistribution if notice text is included) |
| Llama3 | 70b | 77.8 | 71.42 | 85.7 | 80 | 81.7 | 63 | Llama (allows redistribution if notice text is included) |
| Llama3-instruct | 8b | 66.8 | 60.7 | 78.5 | 67.07 | 62.2 | | Llama (allows redistribution if notice text is included) |
| Mistral | 7b | 61 | 60 | 83 | 64 | | | Apache 2.0 |
| Calme-7B-Instruct-v0.9* | 7b | 76 | 73 | 89 | 64 | | | Apache 2.0 |
| Mixtral-8x22b-Instruct* | 141b | 79.1 | 72.7 | 89 | 77.7 | | | Apache 2.0 |
| Zephyr-orpo-141b-A35b* | 141b | NA | NA | NA | NA | NA | 44.16 | Apache 2.0 |
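
For a quick smoke test, any of the shortlisted checkpoints can be loaded and queried through the Hugging Face transformers library. A minimal sketch, assuming the meta-llama/Meta-Llama-3-8B checkpoint (gated; requires accepting the Llama license on Hugging Face) and enough GPU memory; the model id and prompt are illustrative only:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative choice: one of the shortlisted candidate models.
model_id = "meta-llama/Meta-Llama-3-8B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # use the precision stored in the checkpoint
    device_map="auto",   # needs the accelerate package installed
)

prompt = "What is a Kubernetes operator?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```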

Dataset format:

Training datasets are usually stored in .json format with the following form:

```json
{
  "prompt": "What are the three most important things to consider when deciding what technology to use to build an assist device to help an elderly person with basic needs?",
  "response": "To build an assistive device to help an elderly person with basic needs, one must consider three crucial things:..."
}
```
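
To show how such a file would feed a fine-tuning run, here is a minimal sketch using the Hugging Face datasets library; the file name qa_pairs.json and the prompt/response template are assumptions, not project decisions:

```python
from datasets import load_dataset

# Hypothetical file holding {"prompt": ..., "response": ...} records
# in the format described above (a JSON list or JSON lines both work).
dataset = load_dataset("json", data_files="qa_pairs.json", split="train")

def to_training_text(example):
    # Join prompt and response into one training string; the exact
    # template should follow the chosen base model's chat format.
    return {"text": f"{example['prompt']}\n{example['response']}"}

dataset = dataset.map(to_training_text)
print(dataset[0]["text"])
```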

So we need to extract question-and-answer pairs from our data files to be able to train the LLM. The open question is whether the generated Q&As will cover all of our text data.
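
One rough way to put a number on that coverage question is a word-overlap heuristic: count a source chunk as covered if it shares enough vocabulary with at least one generated Q&A pair. A minimal sketch; the 0.3 threshold and the overlap measure are assumptions, not a project decision:

```python
def word_set(text: str) -> set[str]:
    # Crude normalization: lowercase, strip punctuation, drop short words.
    return {w.lower().strip(".,?!") for w in text.split() if len(w) > 3}

def coverage(source_chunks: list[str], qa_pairs: list[dict], threshold: float = 0.3) -> float:
    # Fraction of source chunks whose vocabulary overlaps some Q&A pair
    # by at least `threshold` (0.3 is an arbitrary starting point).
    qa_sets = [word_set(p["prompt"] + " " + p["response"]) for p in qa_pairs]
    covered = 0
    for chunk in source_chunks:
        cw = word_set(chunk)
        if cw and any(len(cw & qs) / len(cw) >= threshold for qs in qa_sets):
            covered += 1
    return covered / len(source_chunks) if source_chunks else 0.0
```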