KoboldAI / KoboldAI-Client

For GGUF support, see KoboldCPP: https://github.com/LostRuins/koboldcpp
https://koboldai.com
GNU Affero General Public License v3.0

MosaicML's MPT-7B models #334

Open ghost opened 1 year ago

ghost commented 1 year ago

Could support be added for these? Their context window is much larger than the current models': 65,536 tokens instead of 2,048, so they retain memory far better. Here is more info about them: https://www.mosaicml.com/blog/mpt-7b
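For a sense of what that larger window costs at inference time, here is a rough sketch of KV-cache size versus context length. The layer count and hidden width are assumptions taken from MPT-7B's published configuration (32 layers, d_model = 4096), and it ignores the weights themselves and any framework overhead:

```python
# Rough KV-cache estimate for a decoder-only transformer.
# n_layers and d_model are assumed from MPT-7B's config;
# bytes_per_elem = 2 assumes fp16 cache entries.

def kv_cache_gb(seq_len: int, n_layers: int = 32,
                d_model: int = 4096, bytes_per_elem: int = 2) -> float:
    # Keys and values each store d_model elements per token per layer.
    return 2 * n_layers * d_model * seq_len * bytes_per_elem / 1024**3

print(f"~{kv_cache_gb(2048):.1f} GB at 2,048 tokens")
print(f"~{kv_cache_gb(65536):.1f} GB at 65,536 tokens")
```

So the cache alone grows from about 1 GB at a 2,048-token context to roughly 32 GB at 65,536 tokens, which is why long-context runs usually need offloading or a reduced window.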

There is also at least one quantized model already: https://huggingface.co/4bit/mpt-7b-storywriter-4bit-128g

I'm more interested in getting that quantized model to run, though, due to the much lower VRAM requirement.
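A quick back-of-the-envelope calculation shows why the 4-bit version is so much more approachable. This is only a sketch of the weight memory alone, ignoring activations, the KV cache, and the group metadata that 4-bit formats like the 128g variant carry:

```python
# Approximate VRAM needed just for the model weights,
# assuming 7e9 parameters (MPT-7B) at a given bit width.

def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    return n_params * bits_per_weight / 8 / 1024**3

N_PARAMS = 7e9  # MPT-7B parameter count

print(f"fp16 weights:  ~{weight_memory_gb(N_PARAMS, 16):.1f} GB")
print(f"4-bit weights: ~{weight_memory_gb(N_PARAMS, 4):.1f} GB")
```

Roughly 13 GB in fp16 versus about 3.3 GB at 4-bit, so the quantized model fits on common consumer GPUs where the full-precision one does not.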

Vadim-Karpenko commented 1 year ago

+up. The 5.0/5.1/8 bit versions are out as well: https://huggingface.co/TheBloke/MPT-7B-Storywriter-GGML