Oneflow-Inc / libai

LiBai(李白): A Toolbox for Large-Scale Distributed Parallel Training
https://libai.readthedocs.io
Apache License 2.0

GLM-libai #443

Closed xiezipeng-ML closed 1 year ago

xiezipeng-ML commented 1 year ago

note: GLM's position embedding differs slightly from libai's. GLM appears to follow the TensorFlow code, which builds the embedding by directly concatenating the sin and cos halves. libai's SinePositionalEmbedding uses the more common implementation that interleaves sin and cos, which matches the original positional-embedding formulation. Others have also questioned the TensorFlow layout: https://github.com/tensorflow/tensor2tensor/issues/391
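To make the difference concrete, here is a minimal NumPy sketch (not libai's actual code) of the two layouts: the tensor2tensor-style embedding concatenates all sin channels followed by all cos channels, while the interleaved variant alternates sin and cos per frequency. The two contain exactly the same values, only in a different channel order:

```python
import numpy as np

def sine_embedding_concat(seq_len, dim):
    """tensor2tensor/GLM-style layout: all sin channels first, then all cos."""
    pos = np.arange(seq_len)[:, None]                          # (seq_len, 1)
    inv_freq = 1.0 / (10000 ** (np.arange(dim // 2) / (dim // 2)))
    angles = pos * inv_freq[None, :]                           # (seq_len, dim/2)
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)

def sine_embedding_interleaved(seq_len, dim):
    """Common layout: sin and cos alternate channel by channel."""
    pos = np.arange(seq_len)[:, None]
    inv_freq = 1.0 / (10000 ** (np.arange(dim // 2) / (dim // 2)))
    angles = pos * inv_freq[None, :]
    out = np.empty((seq_len, dim))
    out[:, 0::2] = np.sin(angles)                              # even channels
    out[:, 1::2] = np.cos(angles)                              # odd channels
    return out

e1 = sine_embedding_concat(8, 16)
e2 = sine_embedding_interleaved(8, 16)
# Reordering the interleaved channels (evens first, then odds)
# recovers the concatenated layout exactly.
perm = np.concatenate([np.arange(0, 16, 2), np.arange(1, 16, 2)])
assert np.allclose(e1, e2[:, perm])
```

Since the two are a fixed channel permutation of each other, loading GLM's TensorFlow-style weights into an interleaved implementation requires permuting the embedding channels (or matching the concat layout), otherwise positions are encoded inconsistently with the pretrained weights.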


xiezipeng-ML commented 1 year ago

```python
import numpy as np
import oneflow as flow
from omegaconf import DictConfig

from libai.utils import distributed as dist
from projects.GLM.configs.glm import cfg
from projects.GLM.modeling_glm import GLMForConditionalGeneration
from projects.GLM.utils.glm_loader import GLMLoaderHuggerFace

# Distributed setup: 2-way tensor parallelism, single data/pipeline stage.
se = DictConfig(
    dict(
        data_parallel_size=1,
        tensor_parallel_size=2,
        pipeline_parallel_size=1,
    )
)
dist.setup_dist_util(se)

# Pre-dumped inputs: token ids, position ids, and generation attention mask.
a = np.load("/home/xiezipeng/libai/projects/GLM/data/a.npy")
b = np.load("/home/xiezipeng/libai/projects/GLM/data/b.npy")
c = np.load("/home/xiezipeng/libai/projects/GLM/data/c.npy")

# Place the inputs as global tensors, broadcast along both mesh axes.
a = flow.tensor(
    a,
    sbp=dist.get_nd_sbp([flow.sbp.broadcast, flow.sbp.broadcast]),
    placement=dist.get_layer_placement(0),
)
b = flow.tensor(
    b,
    sbp=dist.get_nd_sbp([flow.sbp.broadcast, flow.sbp.broadcast]),
    placement=dist.get_layer_placement(0),
)
c = flow.tensor(
    c,
    sbp=dist.get_nd_sbp([flow.sbp.broadcast, flow.sbp.broadcast]),
    placement=dist.get_layer_placement(0),
)

# Load the HuggingFace GLM-10B checkpoint into the libai model and generate.
loader = GLMLoaderHuggerFace(
    GLMForConditionalGeneration,
    cfg,
    "/home/xiezipeng/.cache/huggingface/hub/models--BAAI--glm-10b/snapshots/a1af0020e768c654c1d3921673aa91b2380f2476",
)
model = loader.load()
outputs = model.generate(inputs=a, position_ids=b, generation_attention_mask=c)
print(outputs)
```