Facico / Chinese-Vicuna

Chinese-Vicuna: A Chinese Instruction-following LLaMA-based Model (a low-resource Chinese LLaMA + LoRA approach, with the structure modeled on alpaca)
https://github.com/Facico/Chinese-Vicuna

Why are both startup and inference so slow for the LoRA I trained? #181

Closed greatewei closed 1 year ago

greatewei commented 1 year ago

My instruction data: 69k examples. Base model: llama-13b. Fine-tuning parameters:

MICRO_BATCH_SIZE = 4  # this could actually be 5 but i like powers of 2
BATCH_SIZE = 64
MAX_STEPS = None
GRADIENT_ACCUMULATION_STEPS = BATCH_SIZE // MICRO_BATCH_SIZE
EPOCHS = 3  # we don't always need 3 tbh
LEARNING_RATE = 3e-4  # the Karpathy constant
CUTOFF_LEN = 2048  # raised from the default 256 (the original comment noted 256 covers about 96% of the data)
LORA_R = 8
LORA_ALPHA = 16
LORA_DROPOUT = 0.05
VAL_SET_SIZE = args.test_size  # 2000
TARGET_MODULES = [
    "q_proj",
    "v_proj",
    "k_proj",
    "o_proj",
    "down_proj",
    "gate_proj",
    "up_proj",
]
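
With all seven projection matrices listed in TARGET_MODULES, the adapter has noticeably more trainable parameters than the default q_proj/v_proj-only setup. For reference, a minimal sketch (not the repo's finetune.py) of how these values would typically map onto a peft LoraConfig; the base-model path is a placeholder and 8-bit loading assumes bitsandbytes is installed:

from peft import LoraConfig, get_peft_model
from transformers import LlamaForCausalLM

# Values copied from the config above.
LORA_R = 8
LORA_ALPHA = 16
LORA_DROPOUT = 0.05
TARGET_MODULES = [
    "q_proj", "v_proj", "k_proj", "o_proj",
    "down_proj", "gate_proj", "up_proj",
]

model = LlamaForCausalLM.from_pretrained(
    "decapoda-research/llama-13b-hf",  # placeholder base-model path
    load_in_8bit=True,                 # 8-bit loading; requires bitsandbytes
    device_map="auto",
)

lora_config = LoraConfig(
    r=LORA_R,
    lora_alpha=LORA_ALPHA,
    lora_dropout=LORA_DROPOUT,
    target_modules=TARGET_MODULES,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
# Prints trainable-parameter counts; seven target modules give a larger adapter
# than applying LoRA to q_proj/v_proj alone.
model.print_trainable_parameters()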

Startup takes nearly 5 minutes, and during inference the output always stalls at the end; the final result only appears after a while.
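
A minimal, hypothetical timing sketch (not the repo's generate.py; the checkpoint paths are placeholders) that separates model-load time from generation time, to see where the multi-minute startup and the final stall are actually spent:

import time

import torch
from peft import PeftModel
from transformers import LlamaForCausalLM, LlamaTokenizer

t0 = time.time()
tokenizer = LlamaTokenizer.from_pretrained("decapoda-research/llama-13b-hf")  # placeholder
model = LlamaForCausalLM.from_pretrained(
    "decapoda-research/llama-13b-hf",  # placeholder base model
    load_in_8bit=True,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, "path/to/lora-checkpoint")  # placeholder LoRA dir
model.eval()
print(f"load time: {time.time() - t0:.1f}s")  # most of the startup time should land here

inputs = tokenizer("Hello", return_tensors="pt").to(model.device)
t1 = time.time()
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=256)
# generate() only returns after the whole sequence is produced, so the console
# looks stuck until decoding finishes; a 13B model in 8-bit decodes token by token.
print(f"generate time: {time.time() - t1:.1f}s")
print(tokenizer.decode(out[0], skip_special_tokens=True))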