Abhi5h3k / PrivateDocBot

📚 Local PDF-Integrated Chat Bot: Secure Conversations and Document Assistance with LLM-Powered Privacy

Loading Mistral model. #1

Closed: KaSakee closed this issue 11 months ago

KaSakee commented 1 year ago

I attempted to load the Mistral OpenOrca model but couldn't, so I changed the code a little. Now the first time I ask a query I get no answer, but if I ask the same query again it responds, perhaps from a cache. This is all new to me.

Please let me know if there is a way to load the "mistral-7b-openorca.Q4_K_M.gguf" model without editing the code.

My edit:

# Imports used by this function (already present elsewhere in the project)
from langchain.llms import LlamaCpp
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler


def build_llm(model):
    # Call the function to get the number of threads to use
    num_threads = determine_threads_to_use()
    print(f"Number of threads available: {num_threads}")

    llm = LlamaCpp(
        model_path=cfg.MODEL_BIN_DIR + "/" + model,
        # n_gpu_layers is the number of layers to offload to the GPU;
        # reusing the CPU thread count here may be unintended
        n_gpu_layers=num_threads,
        # f16_kv=True,
        n_batch=8192 // 4,  # integer division: n_batch must be an int
        n_ctx=8192,
        max_tokens=cfg.MAX_NEW_TOKENS,
        n_threads=num_threads,
        temperature=cfg.TEMPERATURE,
        streaming=True,
        repeat_penalty=1.3,
        callbacks=[StreamingStdOutCallbackHandler()],
    )

    # Local CTransformers model
    # llm = CTransformers(
    #     model=cfg.MODEL_BIN_DIR + "/" + model,
    #     model_type="llama",
    #     config={
    #         "max_new_tokens": cfg.MAX_NEW_TOKENS,
    #         "temperature": cfg.TEMPERATURE,
    #         "threads": num_threads,
    #         "stream": True,
    #         "repetition_penalty": 1.3,
    #     },
    #     callbacks=[StreamingStdOutCallbackHandler()],
    # )

    return llm
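
For reference, a call to the edited function would look like this (the model filename is the one from this report; the prompt string is purely illustrative):

llm = build_llm("mistral-7b-openorca.Q4_K_M.gguf")
llm("Summarize the uploaded document.")  # hypothetical prompt; tokens stream to stdout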

If I use the default code, the error is:

RuntimeError: Failed to create LLM 'llama' from 'models/mistral-7b-openorca.Q4_K_M.gguf'.

The error stays the same no matter which model type I change it to.
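
One quick way to confirm the installed ctransformers version, which matters here because (as the reply below notes) GGUF support was only added in a later release:

from importlib.metadata import version
print(version("ctransformers"))  # releases that predate GGUF support fail to load .gguf files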

Abhi5h3k commented 11 months ago

Apologies, just saw your issue.

Tested: mistral-7b-openorca.Q4_K_M.gguf

Just had to update one package.

ctransformers now supports the GGUF format for Llama and Falcon models.
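
A minimal sketch of the fix, assuming the upgrade path is a plain pip install and reusing the CTransformers call that is commented out in build_llm above (the config values are placeholders for cfg.MAX_NEW_TOKENS and cfg.TEMPERATURE):

# First: pip install --upgrade ctransformers
from langchain.llms import CTransformers
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

llm = CTransformers(
    model="models/mistral-7b-openorca.Q4_K_M.gguf",  # GGUF file now loads directly
    model_type="llama",  # Mistral shares the Llama architecture
    config={"max_new_tokens": 512, "temperature": 0.1},  # placeholder values
    callbacks=[StreamingStdOutCallbackHandler()],
)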

Try with the latest code :)