I want to add my PDF also in GGUF model and serve via llama-cpp-server

ggerganov / llama.cpp

LLM inference in C/C++

MIT License


I want to add my PDF also in GGUF model and serve via llama-cpp-server #3741

Closed Prashantsaini25 closed 5 months ago

Prashantsaini25 commented 10 months ago

How do I add my PDF to a GGUF or GGML model so that embeddings are created automatically and the model is served via llama-cpp-server? This is the command I use to start the server:

./server -m ../models/llama-2-13b-chat.Q4_0.gguf -c 2048 --host 0.0.0.0 --port 8080

KerfuffleV2 commented 10 months ago

As far as I know, there's no support for PDFs in the main llama.cpp code. The server does have an embeddings mode (never used it myself). Basically you'd need to find/write something to handle the PDF stuff or convert your PDF to a text file.

The Poppler project has some tools for working with PDF files, such as pdftotext and pdftohtml (the quality of the results may vary, since a lot of information is lost when converting a PDF to text).
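For illustration, here is a minimal Python sketch of the "convert your PDF to a text file" step. It assumes Poppler's pdftotext is installed and on your PATH. The helper names (`pdftotext_command`, `pdf_to_text`) and the choice to write the text file next to the PDF are my own, not anything llama.cpp provides.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def pdftotext_command(pdf_path: str, txt_path: str, keep_layout: bool = False) -> List[str]:
    """Build the argument list for Poppler's pdftotext (hypothetical helper)."""
    cmd = ["pdftotext"]
    if keep_layout:
        # -layout asks pdftotext to keep the physical page layout as far as it can
        cmd.append("-layout")
    cmd += [pdf_path, txt_path]
    return cmd


def pdf_to_text(pdf_path: str, keep_layout: bool = False) -> str:
    """Convert a PDF to plain text with pdftotext and return the text."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found; install Poppler's command line tools first")
    # Assumption: writing the .txt file next to the PDF is acceptable
    txt_path = str(Path(pdf_path).with_suffix(".txt"))
    subprocess.run(pdftotext_command(pdf_path, txt_path, keep_layout), check=True)
    return Path(txt_path).read_text(encoding="utf-8", errors="replace")
```

Something like `text = pdf_to_text("report.pdf")` would then give you a string you can split up and send to the server. Expect tables and multi-column pages to come out messy.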

Prashantsaini25 commented 10 months ago

I want a 10k analysis with this llama.cpp server, and I want to give the input with the curl command.

KerfuffleV2 commented 10 months ago

I want a 10k analysis

I don't really know what this means. If you're saying you want to give the model 10,000 tokens, as far as I know llama-2-13b-chat only supports up to 4,096 unless you use special stuff like RoPE tricks.
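If the document is longer than the context window, one common workaround is to split it into smaller overlapping chunks and only ever send a few of them at a time. A rough sketch follows. It counts words rather than tokens, which is only an approximation (a word is often more than one token), and the function name and default sizes are arbitrary choices of mine, so leave generous headroom below the context size.

```python
from typing import List


def split_into_chunks(text: str, max_words: int = 300, overlap: int = 30) -> List[str]:
    """Split text into word-based chunks, each sharing `overlap` words with the previous one."""
    if overlap < 0 or max_words <= overlap:
        raise ValueError("max_words must be larger than overlap, and overlap must not be negative")
    words = text.split()
    step = max_words - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks
```

The overlap is there so that a sentence cut at a chunk boundary still appears whole in at least one chunk.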

monatis commented 10 months ago

The server only gives you the embeddings, and there's no vector search capability, if that is what you're looking for. You can write your own script that sends passages to the /embedding endpoint of the server and indexes the results in the vector search server or library of your choice. I'm preparing to support more embedding models and vector search capability directly in this project, but there's no ETA for the vector search part at the moment (embedding models are on the way).
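A minimal sketch of such a script, using only the Python standard library and a brute-force in-memory search instead of a real vector database. Several things here are assumptions on my part: that the server was started with embeddings enabled, that it listens on localhost:8080 as in the original command, and that /embedding accepts a JSON body with a "content" field and returns an "embedding" list. The request and response shape may differ between server versions, so check the server README for the build you run.

```python
import json
import math
import urllib.request
from typing import List, Sequence, Tuple

SERVER_URL = "http://localhost:8080"  # assumption: same port as in the original command


def embed(text: str, server_url: str = SERVER_URL) -> List[float]:
    """Ask the server for the embedding of one passage (assumed request/response shape)."""
    payload = json.dumps({"content": text}).encode("utf-8")
    request = urllib.request.Request(
        f"{server_url}/embedding",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["embedding"]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: Sequence[float],
          index: Sequence[Tuple[str, Sequence[float]]],
          k: int = 3) -> List[Tuple[float, str]]:
    """Return the k (score, passage) pairs most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), passage) for passage, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

You would build the index once with something like `index = [(p, embed(p)) for p in passages]` and then call `top_k(embed(question), index)` per query. A linear scan is fine for a few thousand passages; beyond that a proper vector search library is the better choice.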

staviq commented 10 months ago

I believe what you want is RAG (retrieval augmented generation), and this is currently out of the scope of llama.cpp

You might want to try privateGPT, it was designed with this use case in mind, and it can directly "import" PDF documents: https://github.com/imartinez/privateGPT
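If you would rather wire it up by hand, the "augmented generation" half of RAG is mostly prompt assembly: paste the retrieved passages into the prompt and ask the model to answer from them. A rough sketch, assuming the server's /completion endpoint takes a JSON body with "prompt" and "n_predict" and returns the generated text under "content". The prompt wording, function names, and the localhost:8080 address are illustrative choices of mine, and how the passages get retrieved is left to whatever search step you use.

```python
import json
import urllib.request
from typing import List


def build_prompt(question: str, passages: List[str]) -> str:
    """Assemble a prompt that puts numbered context passages ahead of the question."""
    context = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


def complete(prompt: str,
             server_url: str = "http://localhost:8080",  # assumption: same port as in the original command
             n_predict: int = 256) -> str:
    """Send the prompt to the server's /completion endpoint and return the generated text."""
    payload = json.dumps({"prompt": prompt, "n_predict": n_predict}).encode("utf-8")
    request = urllib.request.Request(
        f"{server_url}/completion",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["content"]
```

Keep in mind that the passages, the question, and the answer all have to fit in the context size you pass with -c, so only a handful of passages can go in per request.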

github-actions[bot] commented 5 months ago

This issue was closed because it has been inactive for 14 days since being marked as stale.