OpenBMB / BMTrain

Efficient Training (including pre-training and fine-tuning) for Big Models
Apache License 2.0

Vocab parallel Embedding impl and make example work when tp_size > 1 #186

Closed MayDomine closed 8 months ago

MayDomine commented 8 months ago

Vocab Parallel Embedding implementation

Description

Adds a vocab-parallel embedding layer for tensor-parallel (TP) training, in bmt.nn.parallel_embedding.
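
For readers unfamiliar with the technique, here is a minimal single-process sketch of how a vocab-parallel embedding lookup is commonly done. It assumes the usual row-sharded scheme: each TP rank owns a contiguous slice of the vocabulary, returns zero vectors for token ids outside its slice, and the partial results are summed across ranks. The function names (`shard_bounds`, `local_lookup`, `all_reduce_sum`, `vocab_parallel_embedding`) are hypothetical and are not taken from this PR or from BMTrain's API. The "all-reduce" is simulated with plain Python lists so the example runs without GPUs.

```python
# Single-process simulation of a vocab-parallel embedding lookup.
# Assumption: row-sharded vocabulary with a sum all-reduce; this is an
# illustration of the general technique, not BMTrain's implementation.

def shard_bounds(vocab_size, tp_size, rank):
    """Return the [start, end) range of token ids owned by `rank`."""
    assert vocab_size % tp_size == 0, "sketch assumes an evenly divisible vocab"
    per_rank = vocab_size // tp_size
    return rank * per_rank, (rank + 1) * per_rank

def local_lookup(shard, start, end, token_ids, dim):
    """Look up ids owned by this rank; emit zero vectors for all other ids."""
    out = []
    for tok in token_ids:
        if start <= tok < end:
            out.append(list(shard[tok - start]))  # shift id into local range
        else:
            out.append([0.0] * dim)               # masked: another rank owns it
    return out

def all_reduce_sum(partials):
    """Stand-in for a TP all-reduce: element-wise sum over ranks."""
    n_tokens = len(partials[0])
    dim = len(partials[0][0])
    return [[sum(p[t][d] for p in partials) for d in range(dim)]
            for t in range(n_tokens)]

def vocab_parallel_embedding(full_table, token_ids, tp_size):
    """Shard `full_table` by rows, look up per rank, then sum the partials."""
    vocab_size, dim = len(full_table), len(full_table[0])
    partials = []
    for rank in range(tp_size):
        start, end = shard_bounds(vocab_size, tp_size, rank)
        shard = full_table[start:end]             # rows this rank would hold
        partials.append(local_lookup(shard, start, end, token_ids, dim))
    return all_reduce_sum(partials)

# Toy table: 8 tokens, embedding dim 2, split across 4 simulated ranks.
table = [[float(i), float(i) * 10.0] for i in range(8)]
ids = [0, 3, 4, 7]
result = vocab_parallel_embedding(table, ids, tp_size=4)
```

Because exactly one rank owns each token id, the summed result matches an ordinary lookup in the unsharded table. In a real multi-GPU setup the sum would be a collective operation over the TP group rather than a Python loop.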

Type of Change

How Has This Been Tested?


Checklist