nasirus / llama_index


System hangs when using Llamacpp as LLM #2

Open nasirus opened 1 year ago

nasirus commented 1 year ago

The following code appears to load the llamacpp model properly, but it just ramps up the CPU load and hangs for hours if left running. If service_context=service_context is removed from GPTSimpleVectorIndex.from_documents(), it falls back to OpenAI's API and works fine. What step is missing here to run llama locally? It prints all the debug text that llamacpp normally outputs while loading, so the model is being loaded.


from llama_index import GPTSimpleVectorIndex, LLMPredictor, ServiceContext, download_loader
from langchain.llms import LlamaCpp

# Wrap the local llama.cpp model in an LLMPredictor and build a service context around it
llm_predictor = LLMPredictor(llm=LlamaCpp(model_path="~/Code/llama.cpp/models/30B/ggml-model-q4_0.bin", n_threads=10))
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor)

# Load documents from an Obsidian vault
ObsidianReader = download_loader('ObsidianReader')
documents = ObsidianReader('~/Documents/Obsidian').load_data()

# Build the index with the local LLM instead of the default OpenAI one
index = GPTSimpleVectorIndex.from_documents(documents, service_context=service_context)

print(index.query("Any query here"))
nasirus commented 1 year ago

Thank you for bringing this issue to our attention. We have identified the issue and have a solution for you.

The issue is that the service_context parameter is missing from the GPTSimpleVectorIndex.from_documents() call. This parameter is required to use the LlamaCpp model locally.

The code should be updated to the following:

llm_predictor = LLMPredictor(llm=LlamaCpp(model_path="~/Code/llama.cpp/models/30B/ggml-model-q4_0.bin", n_threads=10))
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor)

ObsidianReader = download_loader('ObsidianReader')
documents = ObsidianReader('~/Documents/Obsidian').load_data()

index = GPTSimpleVectorIndex.from_documents(documents, service_context=service_context)

print(index.query("Any query here"))

Additionally, you may need to adjust the internal prompts to get good performance. A list of all default internal prompts is available here, and chat-specific prompts are listed here. You can also implement your own custom prompts, as described here.
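For example, a custom question-answering prompt can be supplied at query time. The sketch below assumes the 0.5-era llama_index API, where QuestionAnswerPrompt is exported from the top-level package and index.query() accepts a text_qa_template argument; the template text itself is only an illustration.

from llama_index import QuestionAnswerPrompt

# Example template; the index fills in {context_str} and {query_str} at query time
QA_TEMPLATE = (
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Using only the context above, answer the question: {query_str}\n"
)
qa_prompt = QuestionAnswerPrompt(QA_TEMPLATE)

# Reuse the index built earlier, overriding the default QA prompt for this query
print(index.query("Any query here", text_qa_template=qa_prompt))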

We hope this helps! If you have any further questions or need more assistance, please let us know.

Best, The LlamaIndex Team