Open HaoruSung opened 9 months ago
We don't currently have a config for it, but supporting it should be as simple as adding one. It's Mistral-7B based, which we support (without sliding window attention): https://github.com/Lightning-AI/lit-gpt/blob/main/lit_gpt/config.py#L1177-L1191
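For context, lit-gpt keeps model configs in a module-level list and resolves them by `name`. A minimal sketch of that registry pattern (the helper `from_name` and the trimmed-down dicts here are illustrative, not lit-gpt's actual implementation):

```python
# Hypothetical sketch of a name-based config registry
# (lit-gpt's real Config class carries many more fields).
configs = [
    dict(name="Mistral-7B-v0.1", block_size=4096, n_layer=32),
    dict(name="zephyr-7b-beta", block_size=4096, n_layer=32),
]

def from_name(name: str) -> dict:
    """Return the config dict whose `name` matches, or raise if unknown."""
    matches = [c for c in configs if c["name"] == name]
    if not matches:
        raise ValueError(f"unknown config name: {name}")
    return matches[0]
```

Once a new dict is appended to the list, the model becomes selectable by its name string.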
Thank you for your reply! Does the following look right after adding it to config.py?
##########################
# HuggingFaceH4 zephyr-7b-beta
##########################
zephyr = [
    # https://huggingface.co/HuggingFaceH4/zephyr-7b-beta/blob/main/config.json
    dict(
        name="zephyr-7b-beta",
        hf_config=dict(org="HuggingFaceH4", name="zephyr-7b-beta"),
        padded_vocab_size=32000,
        block_size=4096,  # should be 32768 but sliding window attention is not implemented
        n_layer=32,
        n_query_groups=8,
        rotary_percentage=1.0,
        parallel_residual=False,
        bias=False,
        _norm_class="RMSNorm",
        norm_eps=1e-05,
        _mlp_class="LLaMAMLP",
        intermediate_size=14336,
    )
]
configs.extend(zephyr)
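On the `block_size` comment above: Mistral-style models reach long contexts via sliding-window attention, where each token attends only to the most recent `window` positions rather than the whole prefix. A minimal pure-Python sketch of such a mask (an illustration only, not lit-gpt code; the window value is arbitrary here):

```python
def sliding_window_causal_mask(seq_len: int, window: int) -> list[list[bool]]:
    """Build a boolean attention mask.

    mask[i][j] is True when query position i may attend to key position j:
    causal (j <= i) and within the sliding window (i - j < window).
    """
    return [
        [j <= i and i - j < window for j in range(seq_len)]
        for i in range(seq_len)
    ]

mask = sliding_window_causal_mask(seq_len=6, window=3)
# Token 5 attends only to positions 3, 4, and 5.
```

Without this mechanism, capping `block_size` at the window size (4096) is the safe fallback, which is what the comment in the config describes.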
In addition, do I need to use the Zephyr prompt format when preparing the data? Here is my prepare_my_data.py:
def generate_prompt(example: dict) -> str:
    """Generates a standardized message to prompt the model with an instruction,
    optional input and a 'response' field."""
    if example["input"]:
        return (
            f"<|system|>\n{example['instruction']}</s>\n<|user|>\n{example['input']}</s>\n<|assistant|>"
        )
    return f"<|system|>\n{example['instruction']}</s>\n<|assistant|>"
Is there anything else I've missed that needs modifying? Thank you very much for your help!
I recently came across a model called Zephyr-7b-beta, which seems to be a great fit for my current needs. Does lit-gpt currently support this model? Thank you so much!