Comparison of capabilities of LLMs with the same parameter scale.

mobiusml / low-rank-llama2

Low-Rank Llama Custom Training

https://mobiusml.github.io/low-rank-llama2/


Comparison of capabilities of LLMs with the same parameter scale. #1

Open jianyuheng opened 3 weeks ago

jianyuheng commented 3 weeks ago

For example, the capabilities of 7b and 13b-lowrank.

mobicham commented 3 weeks ago

Hi, can you please clarify what you mean? What is "parameter scale" and what is the issue exactly?

jianyuheng commented 3 weeks ago

I mean that 13b-lowrank is equivalent to 6.5b. Have you compared perplexity (ppl) between 13b-lowrank and 7b?

mobicham commented 3 weeks ago

Ah, I understand. No, we didn't. Our goal at that time was to get a 3B version of the Llama2-7B chat model. Phi2 base was released about 3-5 months after this low-rank work, so we trained our own instruct version of it instead, which is better than Llama2-7B chat at half the size: https://huggingface.co/mobiuslabsgmbh/aanaphi2-v0.1. That achieved our goal.
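
For readers following the question above, here is a minimal sketch of the two quantities being discussed: how a low-rank factorization changes the parameter count of a single weight matrix, and how perplexity is computed from per-token log-probabilities. This is not code from the low-rank-llama2 repository. The function names, the square 5120 x 5120 layer shape, and the rank of 1280 are all hypothetical values chosen for illustration, and the sketch assumes each dense matrix W is replaced by a product of two factors A and B.

```python
import math


def dense_params(d_in: int, d_out: int) -> int:
    """Parameter count of a dense weight matrix of shape (d_out, d_in)."""
    return d_in * d_out


def low_rank_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameter count when W is replaced by A @ B,
    with A of shape (d_out, rank) and B of shape (rank, d_in)."""
    return rank * (d_in + d_out)


def perplexity(token_log_probs: list[float]) -> float:
    """Perplexity = exp of the mean negative log-likelihood per token
    (natural-log probabilities assumed)."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


# Hypothetical layer shape and rank, for illustration only.
d = 5120
rank = 1280
ratio = low_rank_params(d, d, rank) / dense_params(d, d)
print(f"low-rank / dense parameter ratio: {ratio}")

# Toy example: a model that assigns probability 0.25 to every token
# has a perplexity of approximately 4.
print(perplexity([math.log(0.25)] * 4))
```

For a square d x d matrix the low-rank version holds 2·r·d parameters, so a rank of d/4 halves the count. A like-for-like comparison of the kind asked about would evaluate both models with `perplexity` on the same held-out text, keeping in mind that per-token perplexities are only directly comparable when the models share a tokenizer.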