Open jianyuheng opened 3 weeks ago
Hi, can you please clarify what you mean? What is "parameter scale" and what is the issue exactly?
I mean that 13b-lowrank is equivalent to about 6.5b parameters. Have you compared perplexity (ppl) between 13b-lowrank and 7b?
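As a rough sketch of where a figure like 6.5b could come from: assuming each dense weight matrix W (rows x cols) is replaced by a product of two thin matrices A (rows x rank) and B (rank x cols), the parameter count drops from rows*cols to rank*(rows + cols). The layer shape and function names below are illustrative assumptions, not taken from the repository.

```python
def dense_params(rows: int, cols: int) -> int:
    """Parameter count of a full weight matrix W (rows x cols)."""
    return rows * cols


def lowrank_params(rows: int, cols: int, rank: int) -> int:
    """Parameter count of the factorization W ~ A @ B,
    with A (rows x rank) and B (rank x cols)."""
    return rank * (rows + cols)


def rank_for_ratio(rows: int, cols: int, ratio: float) -> int:
    """Largest rank whose factorization keeps at most `ratio` of the dense parameters."""
    return int(ratio * rows * cols / (rows + cols))


# Illustrative square layer; the shape is an assumption for this example.
rows = cols = 5120
rank = rank_for_ratio(rows, cols, 0.5)
dense = dense_params(rows, cols)
low = lowrank_params(rows, cols, rank)
print(f"rank={rank}, dense={dense}, low-rank={low}, ratio={low / dense:.2f}")
```

If every matrix is compressed at this ratio, the total is roughly half of 13b. An equal parameter count does not imply equal quality, which is why the ppl comparison against 7b matters.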
Ah, I understand. No, we didn't. Our goal at the time was to get a 3B version of the Llama2-7B chat model. Phi2 base was released about 3-5 months after this low-rank work. We trained our own instruct version instead, which is better than Llama2-7B chat at half the size: https://huggingface.co/mobiuslabsgmbh/aanaphi2-v0.1 and that achieved our goal.
For example, how do the capabilities of 7b and 13b-lowrank compare?