aitsc / GLMKD

Are Intermediate Layers and Labels Really Necessary? A General Language Model Distillation Method; GKD: A General Knowledge Distillation Framework for Large-scale Pre-trained Language Model
MIT License
Compatibility with chatGLM2 and chatGLM3 #2

Open Ishiki-Iroha opened 6 months ago

Ishiki-Iroha commented 6 months ago

Can I use the chatGLM2 and chatGLM3 models to conduct experiments? Are there any modifications required? Thank you very much.

aitsc commented 6 months ago

The original GLM models and the ChatGLM models differ in structure, so modifications are required before ChatGLM2 or ChatGLM3 can be used.