I tried to load the Mistral OpenOrca model but couldn't, so I changed the code slightly. Now, the first time I ask a query there is no answer, and only when I ask the same query again does it respond, possibly from a cache. This is all new to me.
Please let me know if there is a way to load the "mistral-7b-openorca.Q4_K_M.gguf" model without editing the code.
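For context, my understanding is that a minimal standalone load with the ctransformers package (outside LangChain) would look like the sketch below; the model_type value is my guess, mirroring the default code:

# Minimal standalone check with the ctransformers package, no LangChain.
# model_type="llama" mirrors the default code; I am not sure whether a
# dedicated "mistral" type exists.
from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained(
    "models/mistral-7b-openorca.Q4_K_M.gguf",
    model_type="llama",
)
print(llm("Hello", max_new_tokens=16))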
My edit:
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.llms import LlamaCpp


def build_llm(model):
    # Determine how many CPU threads to use
    num_threads = determine_threads_to_use()
    print(f"Number of threads available: {num_threads}")
    llm = LlamaCpp(
        model_path=cfg.MODEL_BIN_DIR + "/" + model,
        n_gpu_layers=num_threads,
        # f16_kv=True,
        n_batch=8192 // 4,  # integer division; n_batch expects an int
        n_ctx=8192,
        max_tokens=cfg.MAX_NEW_TOKENS,
        n_threads=num_threads,
        temperature=cfg.TEMPERATURE,
        streaming=True,
        repeat_penalty=1.3,
        callbacks=[StreamingStdOutCallbackHandler()],
    )
    # Original (default) local CTransformers model:
    # llm = CTransformers(
    #     model=cfg.MODEL_BIN_DIR + "/" + model,
    #     model_type="llama",
    #     config={
    #         "max_new_tokens": cfg.MAX_NEW_TOKENS,
    #         "temperature": cfg.TEMPERATURE,
    #         "threads": num_threads,
    #         "stream": True,
    #         "repetition_penalty": 1.3,
    #     },
    #     callbacks=[StreamingStdOutCallbackHandler()],
    # )
    return llm
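For reference, this is roughly how the function is exercised (the prompt is a placeholder, not my real query):

# Hypothetical driver, only to show how build_llm() is called.
llm = build_llm("mistral-7b-openorca.Q4_K_M.gguf")
# The first call prints nothing; re-sending the same prompt then
# streams a response, possibly from a cache.
print(llm("What is the capital of France?"))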
If I use the default code, the error is:
RuntimeError: Failed to create LLM 'llama' from 'models/mistral-7b-openorca.Q4_K_M.gguf'.
The error is the same no matter which model_type I set.
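In case it matters, I believe older ctransformers builds only read GGML files, so the installed version may be the issue; this is how I would check it (the 0.2.24 cutoff is my assumption, not something I have verified):

# Check the installed ctransformers version.
# Assumption: GGUF support requires roughly >= 0.2.24.
from importlib.metadata import version
print(version("ctransformers"))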