Oneflow-Inc / one-codegeex

Apache License 2.0

Manual cse pass #7

Closed BBuf closed 1 year ago

BBuf commented 1 year ago

master:

N_token_prompt: 127; Total generation time: 26.37485106801614 s; # Tokens: 897; 0.02940340141361889 s/token

pr:

N_token_prompt: 127; Total generation time: 25.027120330836624 s; # Tokens: 897; 0.02790091452713113 s/token
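For a quick read of the two runs above, the per-token latency and the relative change can be recomputed from the logged totals. This is only a sanity check on the numbers posted here (one run each, so it says nothing about variance); the variable names are made up for illustration.

```python
# Values copied from the two log lines above (master vs. this PR).
master_total_s = 26.37485106801614
pr_total_s = 25.027120330836624
num_tokens = 897

# Per-token latency is total generation time divided by generated tokens.
master_per_token = master_total_s / num_tokens
pr_per_token = pr_total_s / num_tokens

# Relative reduction in total generation time, as a fraction of master.
reduction = (master_total_s - pr_total_s) / master_total_s

print(f"master: {master_per_token:.5f} s/token")
print(f"pr:     {pr_per_token:.5f} s/token")
print(f"reduction: {reduction:.1%}")
```

On these two runs the PR comes out roughly 5% faster end to end.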

master:

[image]

pr:

[image]

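As the images show, the mask computation in the Transformer model now appears only in the first self-attention block; all subsequent blocks share the result computed by the first block.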
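The idea can be sketched in plain Python. This is not the code from this PR: `MaskBuilder`, `apply_mask`, `run_per_block`, `run_shared`, and the `-10000.0` fill value are all hypothetical names and choices, used only to show the difference between rebuilding the mask in every block and building it once in the first block and reusing it.

```python
class MaskBuilder:
    """Builds a lower-triangular (causal) mask and counts how often it is built."""

    def __init__(self):
        self.calls = 0

    def __call__(self, seq_len):
        self.calls += 1
        # True where position i may attend to position j (j <= i).
        return [[j <= i for j in range(seq_len)] for i in range(seq_len)]


def apply_mask(scores, mask, fill=-10000.0):
    # Keep allowed scores, overwrite masked positions with a large negative value.
    return [
        [s if keep else fill for s, keep in zip(score_row, mask_row)]
        for score_row, mask_row in zip(scores, mask)
    ]


def run_per_block(scores, num_layers, build_mask):
    # Before: every block recomputes the same mask.
    for _ in range(num_layers):
        mask = build_mask(len(scores))
        scores = apply_mask(scores, mask)
    return scores


def run_shared(scores, num_layers, build_mask):
    # After manual CSE: the first block computes the mask, later blocks reuse it.
    mask = None
    for _ in range(num_layers):
        if mask is None:
            mask = build_mask(len(scores))
        scores = apply_mask(scores, mask)
    return scores


if __name__ == "__main__":
    scores = [[0.5] * 4 for _ in range(4)]
    per_block, shared = MaskBuilder(), MaskBuilder()
    out_a = run_per_block(scores, 3, per_block)
    out_b = run_shared(scores, 3, shared)
    print(out_a == out_b, per_block.calls, shared.calls)
```

Both variants produce the same masked scores; only the number of mask constructions differs, which is where the saving would come from when the mask depends only on the sequence length and not on the layer.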