gotzmann / llama.go

llama.go is like llama.cpp in pure Golang!

Trying to run TheBloke's guanaco-3b-uncensored-v2.ggmlv1.q4_0.bin but got this error. How do I convert it to a format llama.go can load? #24

Open hiqsociety opened 11 months ago

hiqsociety commented 11 months ago
./llama-go-v1.4.0-linux --model=guanaco-3b-uncensored-v2.ggmlv1.q4_0.bin --prompt="write a story about alibaba and snow white"

  /▒▒       /▒▒         /▒▒▒/▒▒▒   /▒▒/▒▒▒▒/▒▒   /▒▒▒/▒▒▒      /▒▒▒▒/▒▒   /▒▒▒/▒▒▒    
  /▒▒▒      /▒▒▒      /▒▒▒/ /▒▒▒ /▒▒▒/▒▒▒▒/▒▒▒ /▒▒▒/ /▒▒▒     /▒▒▒▒ //   /▒▒▒▒//▒▒▒  
  /▒▒▒▒/▒▒  /▒▒▒▒/▒▒  /▒▒▒▒/▒▒▒▒ /▒▒▒/▒▒▒▒/▒▒▒ /▒▒▒▒/▒▒▒▒ /▒▒ /▒▒▒▒/▒▒▒▒ /▒▒▒ /▒▒▒▒ 
  /▒▒▒▒/▒▒▒ /▒▒▒▒/▒▒▒ /▒▒▒ /▒▒▒▒ /▒▒▒//▒▒ /▒▒▒ /▒▒▒ /▒▒▒▒ /▒▒▒//▒▒▒▒/▒▒  //▒▒▒/▒▒▒
  //// ///  //// ///  ///  ////  ///  //  ///  ///  ////  ///  //// //    /// ///

   ▒▒▒▒ [ LLaMA.go v1.4.0 ] [ LLaMA GPT in pure Golang - based on LLaMA C++ ] ▒▒▒▒

[ERROR] Invalid model file 'guanaco-3b-uncensored-v2.ggmlv1.q4_0.bin'! Too old, regenerate!
[ ERROR ] Failed to load model "guanaco-3b-uncensored-v2.ggmlv1.q4_0.bin"

drunlade commented 11 months ago

The error tells you what the issue is: the file format is too old. That is the old .bin version of the file; you probably need the .gguf version.

EDIT: I said that, but I don't see a commit adding GGUF support. Perhaps you need a file in a newer GGML version? Either way, the error is fairly self-explanatory. Assuming GGUF support hasn't been added, see if you can find a newer version of the .bin file.

hiqsociety commented 11 months ago

@drunlade Would you mind showing how to get a newer version of the .bin file, from TheBloke or elsewhere? Otherwise, could you point me to how to convert "the default format" into one that works with llama.go?

Yes, it seems there's no GGUF support, so I'd appreciate it if you could point me to a 3B model that works.

drunlade commented 11 months ago

@hiqsociety I can only suggest browsing TheBloke's list of models on the Hugging Face website to see what the most recent non-GGUF version is.