princeton-nlp / LLM-Shearing

[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning
https://arxiv.org/abs/2310.06694
MIT License

Problem converting the Pythia model to a Composer model #64

Open rzr002 opened 7 months ago

rzr002 commented 7 months ago

I've encountered the same issue. I tried to port the Pythia model to a Composer model by imitating your LLaMA rewrite. It runs without errors, but the output logits are quite different from those of the original model. Could you please prioritize the updates for Pythia or Mistral?
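To make "quite different" concrete, one option is to feed the same input to both models and compare the logits numerically. The following is only a minimal sketch, not code from this repository: `compare_logits` and its `atol` threshold are hypothetical names, and it assumes the logits have already been converted to nested Python lists of shape `[positions][vocab]` (for example via `.tolist()`).

```python
def compare_logits(ref, new, atol=1e-3):
    """Compare two [positions][vocab] logit tables (hypothetical helper).

    ref: logits from the original model
    new: logits from the converted model, for the same input
    atol: assumed tolerance; tune it for the dtype in use
    """
    if not ref or len(ref) != len(new):
        raise ValueError("both inputs need the same, non-zero number of positions")

    max_abs_diff = 0.0
    agree = 0
    for ref_row, new_row in zip(ref, new):
        if len(ref_row) != len(new_row):
            raise ValueError("vocabulary size mismatch")
        # Largest element-wise gap at this position.
        row_diff = max(abs(a - b) for a, b in zip(ref_row, new_row))
        max_abs_diff = max(max_abs_diff, row_diff)
        # Does the top-scoring token match at this position?
        ref_top = max(range(len(ref_row)), key=ref_row.__getitem__)
        new_top = max(range(len(new_row)), key=new_row.__getitem__)
        if ref_top == new_top:
            agree += 1

    return {
        "max_abs_diff": max_abs_diff,
        "argmax_agreement": agree / len(ref),
        "within_tol": max_abs_diff <= atol,
    }


if __name__ == "__main__":
    # Toy values, not real model outputs.
    reference = [[1.0, 2.0, 0.5], [0.1, 0.2, 0.3]]
    converted = [[1.0, 2.0005, 0.5], [0.3, 0.2, 0.1]]
    print(compare_logits(reference, converted))
```

A large `max_abs_diff` together with low `argmax_agreement` would point to a structural mismatch in the port rather than ordinary floating-point noise, which may help narrow down where the two implementations diverge.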