0cc4m / KoboldAI


Error involving bfloat16 on generation with MPT 7B 4-bit_128g #31

Open · Bytemixer opened this issue 1 year ago

Bytemixer commented 1 year ago

May 13, 2023, 5:57PM EST

With the latestgptq branch, the MPT 7B model loads successfully on my RTX 3060 12GB, but on prompting it to generate text, an error involving bfloat16 occurs. A Pastebin of the error is below:

https://pastebin.com/8QQtXdmz
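
The pastebin contents aren't quoted in the thread, but bfloat16 errors with MPT usually come from the model's custom modeling code defaulting to bfloat16 while the rest of the pipeline feeds it float16 tensors. As a point of comparison only, here is a minimal sketch of forcing float16 on a plain (unquantized) transformers load of MPT; the checkpoint name and flags are assumptions based on the public mosaicml/mpt-7b release, and this is not the KoboldAI 4-bit_128g loader:

```python
# Hedged sketch: a plain transformers load of MPT that overrides the
# bfloat16 defaults with float16. The 4-bit GPTQ path in KoboldAI
# loads differently; this only illustrates the dtype override.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "mosaicml/mpt-7b"  # assumption: the unquantized upstream checkpoint

tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(
    name,
    torch_dtype=torch.float16,  # force fp16 instead of MPT's bf16 defaults
    trust_remote_code=True,     # MPT ships custom modeling code
).to("cuda")

inputs = tok("Once upon a time", return_tensors="pt").to("cuda")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=32)
print(tok.decode(out[0], skip_special_tokens=True))
```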

dicksensei69 commented 1 year ago

I don't know exactly why you're getting that, and I haven't gotten it working yet myself, but I do see that others have. Here is a video where the guy shows his settings for running the StoryWriter model; check out around the 4-minute mark. They have bf16 checked along with some other small settings, and it looks like it works.

https://www.youtube.com/watch?v=0wLG4eGuF_E
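
For what it's worth, whether a bf16 toggle like the one in that video can even work depends on the GPU: the RTX 3060 in the original report is Ampere and does support bfloat16, so the failure is more likely a dtype mismatch than missing hardware support. A quick plain-PyTorch check (not KoboldAI-specific) for bf16 support:

```python
import torch

# Ampere (compute capability 8.x) and newer GPUs support bfloat16
# natively; older cards generally report False here and may error
# out on bf16 ops.
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))
    print("compute capability:", torch.cuda.get_device_capability(0))
    print("bf16 supported:", torch.cuda.is_bf16_supported())
else:
    print("no CUDA device visible")
```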

Bytemixer commented 1 year ago

I don't recognize that UI in the video, and I don't remember seeing any checkboxes indicating bfloat16 in the Kobold settings or in SillyTavern. I don't use Ooba/WebUI, and last I heard Ooba can't do MPT.

I'll take a closer look at the video after I get home later, but it doesn't look like something I'm using, so it may not be relevant to my use case. And I'm not the only one with this issue.

Update: Took a closer look at the video. Yeah, I don't use Ooba; I run only KoboldAI 4-bit and the SillyTavern dev build, not Ooba/TextGenWebUI, so the video isn't much help for my use case. Also, since Ooba is both backend and frontend, and he's selecting MPT from a list, I wonder whether he's even running it locally in the video, or strictly through the WebUI with the model hosted elsewhere rather than on his own system. So I don't see how that video is relevant to KoboldAI 4-bit at all.