Open ghost opened 1 year ago
Could support be added for these? Their context window is far larger than the current models': 65,536 tokens instead of 2,048, so they retain memory much better. Here is more info about them: https://www.mosaicml.com/blog/mpt-7b
There is also at least one quantized model already: https://huggingface.co/4bit/mpt-7b-storywriter-4bit-128g
I'm more interested in getting that quantized model to run, though, because of its much lower VRAM requirement.
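As a quick sanity check on the context-size claim, the model's advertised context window can be read from its config with Hugging Face `transformers`, without downloading the weights. This is only a sketch; it assumes network access to the Hub and that the `mosaicml/mpt-7b-storywriter` repo exposes a `max_seq_len` field through its custom model code (hence `trust_remote_code=True`):

```python
# Sketch: inspect MPT-7B-StoryWriter's context window without downloading weights.
# Assumes the `transformers` package and network access to the Hugging Face Hub;
# trust_remote_code=True is needed because MPT ships custom modeling code.
from transformers import AutoConfig

config = AutoConfig.from_pretrained(
    "mosaicml/mpt-7b-storywriter",
    trust_remote_code=True,
)
print(config.max_seq_len)  # the 65536-token context window claimed above
```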
+up. The 5.0/5.1/8-bit GGML versions are out as well: https://huggingface.co/TheBloke/MPT-7B-Storywriter-GGML