arcee-ai / DistillKit

An Open Source Toolkit For LLM Distillation
GNU Affero General Public License v3.0

RuntimeError: 'weight' must be 2-D #3

Open Hasan-Syed25 opened 1 month ago

Hasan-Syed25 commented 1 month ago

I get RuntimeError: 'weight' must be 2-D when using DeepSpeed ZeRO-3 for distributed training. Is this an issue with DeepSpeed, or is it an initialization issue? Here is a link to the same issue that I am facing. What am I missing here?

Thanks

Crystalcareai commented 1 month ago

DeepSpeed ZeRO-3 currently throws a lot of errors. We're working on it and will have a fix out soon.

fernando-neto-ai commented 1 month ago

@Hasan-Syed25 could you paste your environment details and your code so I can dig into it?
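
For anyone gathering the requested environment details, a minimal sketch along these lines may help. The package list is an assumption (it is not taken from this thread), so adjust it to whatever your training script actually imports; `collect_env` is a hypothetical helper name.

```python
import platform
import sys
from importlib import metadata

# Assumed package list for illustration; edit to match your own setup.
PACKAGES = ("torch", "deepspeed", "transformers", "accelerate", "trl")


def collect_env(packages=PACKAGES):
    """Return a dict with interpreter, OS, and installed package versions."""
    info = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }
    for name in packages:
        try:
            # Looks up the version recorded in the installed distribution metadata.
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            info[name] = "not installed"
    return info


if __name__ == "__main__":
    for key, value in collect_env().items():
        print(f"{key}: {value}")
```

Pasting the printed output together with the DeepSpeed config and launch command should be enough to start with. DeepSpeed also ships a `ds_report` command that prints its own environment summary.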

zhangchushu commented 3 weeks ago

I'm hitting the same problem. Has anyone solved it?