LostRuins / koboldcpp

Run GGUF models easily with a KoboldAI UI. One File. Zero Install.
https://github.com/lostruins/koboldcpp
GNU Affero General Public License v3.0

[BUG] (v1.43) -> Model unknown, cannot load -> Airoboros c34b and ReMM-Lion #424

Open SabinStargem opened 1 year ago

SabinStargem commented 1 year ago

On older versions of KoboldCPP, I have been able to use Airoboros c34b. That is no longer the case. From the looks of it, Kobold is seeing it as Llama v0?


Welcome to KoboldCpp - Version 1.43
For command line arguments, please refer to --help

Attempting to use CuBLAS library for faster prompt ingestion. A compatible CuBLAS will be required.
Initializing dynamic library: koboldcpp_cublas.dll

Overriding thread count, using 6 threads instead.
Namespace(bantokens=None, blasbatchsize=2048, blasthreads=6, config=None, contextsize=16384, debugmode=False, forceversion=0, gpulayers=0, highpriority=False, hordeconfig=None, host='', launch=True, lora=None, model=None, model_param='C:/KoboldCPP/Models/Airoboros v2.1.6 - L2 c34b q6k.kcpps', noavx2=False, noblas=False, nommap=False, port=5001, port_param=5001, psutil_set_threads=True, ropeconfig=[0.0, 10000.0], skiplauncher=False, smartcontext=False, stream=False, tensor_split=None, threads=6, unbantokens=False, useclblast=None, usecublas=['normal', '0', 'mmq'], usemirostat=None, usemlock=True)

Loading model: C:\KoboldCPP\Models\Airoboros v2.1.6 - L2 c34b q6k.kcpps
[Threads: 6, BlasThreads: 6, SmartContext: False]

Identified as LLAMA model: (ver 0)
Attempting to Load...

Using automatic RoPE scaling (scale:1.000, base:200000.0)
System Info: AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 0 | VSX = 0 |

Unknown Model, cannot load.
Load Model OK: False
Could not load model: C:\KoboldCPP\Models\Airoboros v2.1.6 - L2 c34b q6k.kcpps

...Okay, another model is also like this: ReMM-Lion.


Welcome to KoboldCpp - Version 1.43
For command line arguments, please refer to --help

Attempting to use CuBLAS library for faster prompt ingestion. A compatible CuBLAS will be required.
Initializing dynamic library: koboldcpp_cublas.dll

Overriding thread count, using 6 threads instead.
Namespace(bantokens=None, blasbatchsize=2048, blasthreads=6, config=None, contextsize=16384, debugmode=False, forceversion=0, gpulayers=15, highpriority=False, hordeconfig=None, host='', launch=True, lora=None, model=None, model_param='C:/KoboldCPP/Models/ReMM-Lion 13b q6.kcpps', noavx2=False, noblas=False, nommap=False, port=5001, port_param=5001, psutil_set_threads=True, ropeconfig=[0.0, 10000.0], skiplauncher=False, smartcontext=False, stream=False, tensor_split=None, threads=6, unbantokens=False, useclblast=None, usecublas=['normal', '0', 'mmq'], usemirostat=None, usemlock=True)

Loading model: C:\KoboldCPP\Models\ReMM-Lion 13b q6.kcpps
[Threads: 6, BlasThreads: 6, SmartContext: False]

Identified as LLAMA model: (ver 0)
Attempting to Load...

Using automatic RoPE scaling (scale:1.000, base:200000.0)
System Info: AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 0 | VSX = 0 |

Unknown Model, cannot load.
Load Model OK: False
Could not load model: C:\KoboldCPP\Models\ReMM-Lion 13b q6.kcpps

[process exited with code 3 (0x00000003)]

I will just list the models I have tried with success or failure at this point. Y = good, N = bad.

Y - Airoboros v2.1 13b
N - Airoboros v2.1 c34b
Y - Airoboros v2.1 Creative 70b
Y - Kimiko v2 70b
Y - MLewd
Y - Mythalion
N - ReMM-Lion
Y - ReMM-SLERP
Y - Vicuna v1.5 16k

LostRuins commented 1 year ago

Ermm... you are trying to load a kcpps file? That's a text settings file. Shouldn't you be loading a GGUF or .bin file?
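For reference, a quick way to tell the two apart is to check the first bytes of the file: GGUF models begin with the ASCII magic "GGUF", while a .kcpps file is launcher settings saved as JSON text. A minimal sketch in Python (illustrative only, not koboldcpp's actual detection logic; the legacy GGML magics listed are an assumption):

def classify_model_file(path: str) -> str:
    # Read only the header; model files can be many GB in size.
    with open(path, "rb") as f:
        head = f.read(16)
    if head.startswith(b"GGUF"):
        return "GGUF model"
    if head[:4] in (b"tjgg", b"lmgg"):  # assumed legacy GGJT/GGML magics
        return "legacy GGML model"
    if head.lstrip().startswith(b"{"):
        # .kcpps launcher settings are stored as JSON text
        return "settings file (e.g. .kcpps), not model weights"
    return "unknown"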

SabinStargem commented 1 year ago

...You are correct. :P

It would be cool if the model selector only showed compatible files.

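On that feature request: a file-type filter on the picker would avoid this mix-up entirely. A minimal sketch using tkinter's standard file dialog (hypothetical, not the actual koboldcpp launcher code; the extension list is a guess):

from tkinter import filedialog

def pick_model_file() -> str:
    # Offer only model weights by default, so settings files such as
    # .kcpps never show up in the model selector.
    return filedialog.askopenfilename(
        title="Select a model file",
        filetypes=[
            ("GGUF models", "*.gguf"),
            ("Legacy GGML models", "*.bin *.ggml"),
            ("All files", "*.*"),
        ],
    )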