OpenBMB / ModelCenter

Efficient, Low-Resource, Distributed transformer implementation based on BMTrain
https://modelcenter.readthedocs.io
Apache License 2.0

[FEATURE] support model.from_pretrained without the need of init distributed #20

Open Jiaxin-Wen opened 2 years ago

Jiaxin-Wen commented 2 years ago
from model_center.layer import CPM1
CPM1.from_pretrained("cpm1-large")

This currently does not work without initializing the distributed environment, because the function check_web_and_convert_path calls bmt.rank() or bmt.print_rank() to keep every process from downloading the checkpoint in a multi-GPU scenario.

While ModelCenter is mainly designed to support distributed training, I think it is still important to support such a common code snippet.
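
One possible direction, shown only as a minimal sketch: fall back to rank 0 when the rank cannot be queried, so that a single process still downloads the checkpoint. The helper names rank_or_zero and should_download are hypothetical and are not part of ModelCenter or BMTrain. The sketch also assumes that the rank query raises an exception when the distributed environment has not been initialized, which would need to be checked against the real behavior of bmt.rank().

```python
from typing import Callable


def rank_or_zero(get_rank: Callable[[], int]) -> int:
    """Return the process rank, or 0 if the rank query fails.

    `get_rank` stands in for a call such as bmt.rank(). Assumption: it
    raises when the distributed environment is not initialized.
    """
    try:
        return get_rank()
    except Exception:
        # Not initialized: treat the process as the only (rank 0) process.
        return 0


def should_download(get_rank: Callable[[], int]) -> bool:
    """Only rank 0 downloads; a non-distributed process counts as rank 0."""
    return rank_or_zero(get_rank) == 0


if __name__ == "__main__":
    def uninitialized_rank() -> int:
        # Simulates a rank query made before distributed init.
        raise RuntimeError("distributed environment is not initialized")

    print(should_download(uninitialized_rank))
    print(should_download(lambda: 1))
```

With a guard like this, the single-process snippet above would take the download path, while multi-GPU runs would keep the current behavior of downloading on rank 0 only.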