XiaoBin1992 / clover

Official Implementation of Clover-1 and Clover-2
Apache License 2.0

baichuan model support #6

Open Lucas-TY opened 2 months ago

Lucas-TY commented 2 months ago

May I ask whether the Baichuan model is compatible with Clover?

Could you please give some instructions on how to train the draft model for a Baichuan model?

XiaoBin1992 commented 2 months ago

Clover does support the Baichuan (BC) series of models; in fact, many of the Clover experiments were run on the BC2 7B model. Training the Clover draft model for a BC model follows the same procedure as for other models, with two points to note:

  1. During initialization, the lm_head weight of the BC model is normalized; this normalized weight corresponds to the embedding (emb) lookup part of the Clover draft model.
  2. Try to train the draft model on the SFT dataset; the larger and more diverse the data, the better. A larger batch size can accelerate convergence, so correspondingly fewer epochs can be used.
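
To make point 1 more concrete, here is a minimal, framework-free sketch of normalizing an lm_head weight and using the result as an embedding lookup table. It is not taken from the Clover code base. The names `l2_normalize_rows`, `lm_head_weight`, `draft_emb_table`, and `emb_lookup` are hypothetical, the toy values are made up, and it assumes the normalization is an L2 norm over each vocabulary row (in PyTorch, `torch.nn.functional.normalize` performs this kind of row-wise scaling).

```python
import math


def l2_normalize_rows(weight, eps=1e-12):
    """Return a copy of `weight` (a list of rows) with each row scaled to unit L2 norm."""
    normalized = []
    for row in weight:
        norm = math.sqrt(sum(x * x for x in row))
        scale = 1.0 / max(norm, eps)  # eps guards against all-zero rows
        normalized.append([x * scale for x in row])
    return normalized


def emb_lookup(table, token_id):
    """Return the embedding row for a single token id."""
    return table[token_id]


# Hypothetical toy lm_head weight: 3 vocabulary entries, hidden size 2.
lm_head_weight = [[3.0, 4.0], [0.0, 2.0], [1.0, 1.0]]

# Assumption: the draft model's emb lookup table is initialized from the
# normalized lm_head rows rather than from the raw weight.
draft_emb_table = l2_normalize_rows(lm_head_weight)

print(emb_lookup(draft_emb_table, 0))
```

With real checkpoints the same step would be applied once to the full (vocab_size, hidden_size) weight before draft-model training starts.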