databricks / dbrx

Code examples and resources for DBRX, a large language model developed by Databricks
https://www.databricks.com/

Real Performance versus llama-70B? #10

Closed JadeRay closed 6 months ago

JadeRay commented 6 months ago

I have a question about the inference data posted in this blog post: https://www.databricks.com/blog/introducing-dbrx-new-state-art-open-llm

An MoE model with 36B active parameters and 132B total parameters should show inference performance roughly like a 90B dense model at 2000 prompt tokens and 256 output tokens. How can it consistently outperform the Llama2-70B dense model? As the batch size increases, it should beat Llama2-70B at first, but then fall behind from around batch size 3 or 4, because more and more experts are activated until effectively all 132B parameters must be loaded.
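To make this argument concrete, here is a rough sketch of the expected parameter load per decode step as batch size grows. It assumes DBRX-style routing (16 experts, top-4 per token, per the DBRX announcement) and, as a simplifying assumption, that each token routes uniformly and independently; real router distributions are not uniform, so treat the numbers as illustrative only.

```python
from math import comb

# Assumed DBRX-style configuration (16 experts, top-4 routing);
# uniform independent routing is a simplifying assumption.
TOTAL_PARAMS_B = 132.0   # total parameters, billions
ACTIVE_PARAMS_B = 36.0   # parameters active per token, billions
N_EXPERTS, TOP_K = 16, 4

# Split parameters into shared weights (always read) and expert weights:
# active = shared + expert_total * (TOP_K / N_EXPERTS)
expert_total = (TOTAL_PARAMS_B - ACTIVE_PARAMS_B) / (1 - TOP_K / N_EXPERTS)
shared = TOTAL_PARAMS_B - expert_total

def expected_loaded_params(batch_size: int) -> float:
    """Expected billions of parameters read per decode step,
    assuming each token picks TOP_K of N_EXPERTS uniformly at random."""
    # P(a given expert is NOT picked by one token) = C(15,4)/C(16,4) = 0.75
    p_miss = comb(N_EXPERTS - 1, TOP_K) / comb(N_EXPERTS, TOP_K)
    frac_experts_hit = 1 - p_miss ** batch_size
    return shared + expert_total * frac_experts_hit

for b in (1, 2, 3, 4, 8, 32):
    print(f"batch={b:2d}  ~{expected_loaded_params(b):6.1f}B params loaded")
```

Under these assumptions the expected load is 36B at batch size 1, ~60B at batch size 2, and crosses the 70B mark at batch size 3, which matches the intuition above that a memory-bandwidth-bound comparison should flip in Llama2-70B's favor around batch size 3 or 4.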


dskhudia commented 6 months ago

@JadeRay: The benchmarking methodology is explained here: https://github.com/databricks/dbrx/issues/9#issuecomment-2025688421. Let us continue the discussion there.

Closing this in favor of https://github.com/databricks/dbrx/issues/9