turboderp / exllamav2

A fast inference library for running LLMs locally on modern consumer-class GPUs
MIT License

Support GPT2 architecture #379

Closed iamwavecut closed 4 months ago

iamwavecut commented 6 months ago

As simple as that, please!

turboderp commented 4 months ago

Sorry for taking a while to respond. I really didn't see much of a point TBH, but then I added support for the new Granite models, and it turns out they're basically GPT2 anyway.

So GPT2 is supported on the dev branch now, and it will be included from v0.0.21 once I'm ready to release.