ymcui / Chinese-Mixtral

Chinese Mixtral mixture-of-experts large language models (Chinese Mixtral MoE LLMs)
https://arxiv.org/abs/2403.01851
Apache License 2.0

Add training scripts #18

Closed iMountTai closed 8 months ago

iMountTai commented 8 months ago

Description

This PR adds scripts for pre-training (PT) and supervised fine-tuning (SFT). We use QLoRA to train Chinese-Mixtral and adopt an auxiliary loss to balance the load across experts.
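
For context, here is a minimal sketch of what a QLoRA setup with the MoE load-balancing auxiliary loss looks like, using Hugging Face `transformers` and `peft`. The base model name and the LoRA hyperparameters (`r`, `lora_alpha`, `target_modules`) are illustrative assumptions, not necessarily the values used in this PR's scripts:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization of the frozen base weights (the "Q" in QLoRA)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mixtral-8x7B-v0.1",  # illustrative base model
    quantization_config=bnb_config,
    # With output_router_logits=True, Mixtral adds the router
    # load-balancing auxiliary loss to the LM loss during training.
    output_router_logits=True,
    torch_dtype=torch.bfloat16,
)
model = prepare_model_for_kbit_training(model)

# Attach trainable low-rank adapters; hyperparameters are assumptions.
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

The resulting model can then be passed to a standard `Trainer` loop; the auxiliary loss is folded into `loss` automatically when router logits are enabled.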

Related Issue

#13 #15