pytorch-labs / gpt-fast

Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.
BSD 3-Clause "New" or "Revised" License
5.34k stars, 484 forks

[example] Add support for DBRX #174

Open yanboliang opened 1 month ago

yanboliang commented 1 month ago

Add support for DBRX from Databricks. Initial performance numbers:

|                     | 1 GPU | 2 GPU | 4 GPU | 8 GPU  |
|---------------------|-------|-------|-------|--------|
| baseline (bfloat16) | OOM   | OOM   | 59.53 | 100.51 |
| int8                | OOM   | 66.72 | 91.21 | 146.86 |
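
To make the table easier to read, here is a minimal sketch that derives the int8-over-bfloat16 speedup and the 4-to-8 GPU scaling factor from the numbers above. The helper names are hypothetical and not part of gpt-fast. The last helper gives a rough weights-only memory estimate that may explain the OOM pattern; it assumes DBRX's roughly 132B total parameters and ignores activations and the KV cache, and the GPU memory size is not stated in this PR.

```python
# Throughput figures copied from the table above (units as reported in the PR).
# OOM entries are simply omitted.
results = {
    "bfloat16": {4: 59.53, 8: 100.51},
    "int8": {2: 66.72, 4: 91.21, 8: 146.86},
}


def int8_speedup(results):
    """Ratio of int8 to bfloat16 throughput for GPU counts where both ran."""
    shared = sorted(set(results["bfloat16"]) & set(results["int8"]))
    return {n: results["int8"][n] / results["bfloat16"][n] for n in shared}


def scaling(results, mode, small, large):
    """Throughput gain when going from `small` to `large` GPUs in one mode."""
    return results[mode][large] / results[mode][small]


def weight_gb_per_gpu(n_params, bytes_per_param, n_gpus):
    """Weights-only memory per GPU in GB, assuming an even split (an assumption)."""
    return n_params * bytes_per_param / n_gpus / 1e9


if __name__ == "__main__":
    for n, ratio in int8_speedup(results).items():
        print(f"{n} GPUs: int8 is {ratio:.2f}x bfloat16")
    print(f"bfloat16 4 -> 8 GPUs: {scaling(results, 'bfloat16', 4, 8):.2f}x")
    print(f"int8     4 -> 8 GPUs: {scaling(results, 'int8', 4, 8):.2f}x")

    # ASSUMPTION: ~132e9 total parameters for DBRX; 2 bytes (bfloat16) or 1 byte (int8).
    for mode, nbytes in (("bfloat16", 2), ("int8", 1)):
        for gpus in (1, 2, 4, 8):
            gb = weight_gb_per_gpu(132e9, nbytes, gpus)
            print(f"{mode:>8} on {gpus} GPU(s): ~{gb:.0f} GB of weights per GPU")
```

On these numbers int8 is about 1.5x faster than bfloat16 at both 4 and 8 GPUs, and doubling from 4 to 8 GPUs yields well under 2x in either mode. Under the stated assumptions, the OOM cells are the ones where the weights alone exceed 100 GB per GPU.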