Tencent / TurboTransformers

A fast and user-friendly runtime for transformer inference (Bert, Albert, GPT2, decoders, etc.) on CPU and GPU.

Albert model aware #202

Closed (feifeibear closed this issue 4 years ago)

feifeibear commented 4 years ago

The Albert model uses the model-aware memory allocator.
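
A model-aware allocator plans activation memory from the model's structure instead of allocating each tensor on the fly: it looks at which operators produce and consume each activation, lets tensors with non-overlapping lifetimes share space, and reserves one arena per model. Below is a minimal, hypothetical C++ sketch of that general idea (greedy-by-size offset assignment over tensor lifetimes). The tensor names, sizes, and operator indices are invented for illustration and this is not the TurboTransformers implementation.

```cpp
// Hypothetical sketch: greedy-by-size offset assignment for a model-aware
// allocator. All names and numbers below are made up for demonstration.
#include <algorithm>
#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

struct TensorUsage {
  std::string name;        // activation name (hypothetical)
  std::size_t bytes;       // required size in bytes
  int first_op;            // first operator that touches this tensor
  int last_op;             // last operator that touches this tensor
  std::size_t offset = 0;  // assigned offset into the shared arena
};

// Two tensors may reuse the same memory only if their lifetimes are disjoint.
static bool LifetimesOverlap(const TensorUsage& a, const TensorUsage& b) {
  return a.first_op <= b.last_op && b.first_op <= a.last_op;
}

// Place the largest tensors first; each tensor takes the lowest offset that
// does not collide with an already-placed tensor whose lifetime overlaps.
std::size_t AssignOffsets(std::vector<TensorUsage>& tensors) {
  std::vector<TensorUsage*> order;
  for (auto& t : tensors) order.push_back(&t);
  std::sort(order.begin(), order.end(),
            [](const TensorUsage* a, const TensorUsage* b) {
              return a->bytes > b->bytes;
            });

  std::vector<TensorUsage*> placed;
  std::size_t arena_bytes = 0;
  for (TensorUsage* t : order) {
    std::size_t best = 0;
    bool moved = true;
    while (moved) {  // bump past every conflicting placed tensor
      moved = false;
      for (TensorUsage* p : placed) {
        if (!LifetimesOverlap(*t, *p)) continue;
        std::size_t p_end = p->offset + p->bytes;
        if (best < p_end && best + t->bytes > p->offset) {
          best = p_end;
          moved = true;
        }
      }
    }
    t->offset = best;
    placed.push_back(t);
    arena_bytes = std::max(arena_bytes, best + t->bytes);
  }
  return arena_bytes;  // size of the single arena allocated once per model
}

int main() {
  // Toy activation plan for a few Albert-style operators (invented sizes).
  std::vector<TensorUsage> plan = {
      {"qkv_out", 3 * 768 * 128, 0, 1},
      {"attn_scores", 12 * 128 * 128, 1, 2},
      {"attn_out", 768 * 128, 2, 3},
      {"ffn_intermediate", 3072 * 128, 3, 4},
      {"ffn_out", 768 * 128, 4, 5},
  };
  std::size_t arena = AssignOffsets(plan);
  std::cout << "arena bytes: " << arena << "\n";
  for (const auto& t : plan)
    std::cout << t.name << " -> offset " << t.offset << "\n";
}
```

Because Albert shares parameters across layers, the activation plan repeats per layer, so an offset plan computed once can be reused for every layer's forward pass; the sketch above only shows the planning step, not the runtime allocation path.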