OpenBMB / BMTrain

Efficient Training (including pre-training and fine-tuning) for Big Models

Apache License 2.0

560 stars, 77 forks

Vocab parallel Embedding impl and make example work when tp_size > 1 #186

Closed by MayDomine 8 months ago

MayDomine commented 8 months ago

Vocab Parallel Embedding implementation

Description

Provides a vocab-parallel embedding for tensor parallelism (TP) in bmt.nn.parallel_embedding, and makes the example work when tp_size > 1.
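
For readers unfamiliar with the idea, here is a minimal single-process sketch of a vocab-parallel embedding lookup. It assumes the common scheme in which the embedding table is split row-wise across TP ranks, each rank returns zeros for token ids outside its shard, and the partial results are summed across ranks. This is not the code from this PR, whose implementation may differ. All names below (shard_rows, local_lookup, all_reduce_sum, vocab_parallel_embedding) are hypothetical, and the all-reduce is simulated with a plain sum rather than a real collective.

```python
# Single-process simulation of a vocab-parallel embedding lookup.
# Illustration only: this is NOT BMTrain's implementation, and every
# function name here is hypothetical.

def shard_rows(table, tp_size):
    """Split the embedding table row-wise into tp_size equal vocab shards."""
    vocab_size = len(table)
    assert vocab_size % tp_size == 0, "sketch assumes vocab_size % tp_size == 0"
    per_rank = vocab_size // tp_size
    return [table[r * per_rank:(r + 1) * per_rank] for r in range(tp_size)]


def local_lookup(shard, rank, token_ids):
    """What one rank computes: rows for ids it owns, zero vectors otherwise."""
    per_rank = len(shard)
    dim = len(shard[0])
    start = rank * per_rank          # first vocab id owned by this rank
    end = start + per_rank           # one past the last owned id
    out = []
    for tok in token_ids:
        if start <= tok < end:
            out.append(list(shard[tok - start]))  # shift to a local row index
        else:
            out.append([0.0] * dim)               # masked: another rank owns it
    return out


def all_reduce_sum(partials):
    """Stand-in for an all-reduce: elementwise sum over the per-rank outputs."""
    n_tokens = len(partials[0])
    dim = len(partials[0][0])
    return [[sum(p[t][d] for p in partials) for d in range(dim)]
            for t in range(n_tokens)]


def vocab_parallel_embedding(table, token_ids, tp_size):
    """Simulate all ranks in one process and combine their partial results."""
    shards = shard_rows(table, tp_size)
    partials = [local_lookup(shards[r], r, token_ids) for r in range(tp_size)]
    return all_reduce_sum(partials)


# Toy setup: 8-token vocabulary, embedding dimension 2.
table = [[float(i), float(10 * i)] for i in range(8)]
token_ids = [5, 0, 7, 2]

result = vocab_parallel_embedding(table, token_ids, tp_size=4)
reference = [table[t] for t in token_ids]  # ordinary, unsharded lookup
```

Because exactly one rank owns each token id, the sum across ranks reproduces the unsharded lookup, so result equals reference for any tp_size that divides the vocabulary size.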

Type of Change

[ ] Bug fix (non-breaking change which fixes an issue)
[x] New feature (non-breaking change which adds functionality)
[ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
[ ] This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce.

Checklist

[x] I have read the CONTRIBUTING document.
[x] My code follows the code style of this project.
[x] My change requires a change to the documentation.
[ ] I have updated the documentation accordingly.
[ ] I have added tests to cover my changes.
[ ] All new and existing tests passed.