Future-House / paper-qa

High accuracy RAG for answering questions from scientific documents with citations
Apache License 2.0

Open Source LLM Server #378

DrachmaDroysen opened this issue 2 months ago

DrachmaDroysen commented 2 months ago

Congrats on the recent release!

I was hoping to test PaperQA2 with an open source LLM server, but I've been struggling to get it working. I've used ollama before and was able to set up llamafile for this, but I'm not sure what else is needed to get PaperQA2 to talk to llamafile. Could someone please provide some guidance?

From the README: PaperQA2 uses an LLM to operate, so you'll need to either set an appropriate API key environment variable (i.e. export OPENAI_API_KEY=sk-...) or set up an open source LLM server (i.e. using llamafile). Any LiteLLM compatible model can be configured for use with PaperQA2.
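
For reference, with a hosted API the minimal usage is just something like the snippet below (with OPENAI_API_KEY already exported; "my_papers" is a placeholder folder of PDFs), but I'd like the equivalent pointed at llamafile:

    from paperqa import Settings, ask

    # Assumes OPENAI_API_KEY is already exported in the environment.
    # "my_papers" is a hypothetical directory of PDFs to index and answer from.
    answer = ask(
        "What manufacturing challenges are unique to bispecific antibodies?",
        settings=Settings(paper_directory="my_papers"),
    )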

CrispStrobe commented 2 months ago

The documentation already shows how to do it in Python, but maybe you mean via the CLI? I just tried it with:

    pqa --llm "ollama/cas/llama3.1-8b-spaetzle-v74" \
        --llm_config '{"model_list": [{"model_name": "ollama/cas/llama3.1-8b-spaetzle-v74", "litellm_params": {"model": "ollama/cas/llama3.1-8b-spaetzle-v74", "api_base": "http://localhost:11434"}}]}' \
        --summary_llm "ollama/cas/llama3.1-8b-spaetzle-v74" \
        --summary_llm_config '{"model_list": [{"model_name": "ollama/cas/llama3.1-8b-spaetzle-v74", "litellm_params": {"model": "ollama/cas/llama3.1-8b-spaetzle-v74", "api_base": "http://localhost:11434"}}]}' \
        ask 'What is...?'
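
For reference, the same configuration in Python should look roughly like this (same ollama model name, assuming ollama is serving on its default port 11434; this just mirrors the JSON passed to --llm_config above):

    from paperqa import Settings, ask

    model = "ollama/cas/llama3.1-8b-spaetzle-v74"

    # Mirrors the JSON passed via --llm_config / --summary_llm_config on the CLI.
    local_llm_config = {
        "model_list": [
            {
                "model_name": model,
                "litellm_params": {
                    "model": model,
                    "api_base": "http://localhost:11434",
                },
            }
        ]
    }

    answer = ask(
        "What is...?",
        settings=Settings(
            llm=model,
            llm_config=local_llm_config,
            summary_llm=model,
            summary_llm_config=local_llm_config,
        ),
    )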

dosubot[bot] commented 2 months ago

Hey @DrachmaDroysen! 👋 I'm a bot here to help you with bugs, questions, and becoming a contributor. Let's tackle this together!

I found a similar unsolved discussion that might be relevant: Local LLM issue [1]. Additionally, there is a closed issue that might provide some insights: OpenAI API Alternative [2].

To get PaperQA2 to work with an open-source LLM server using llamafile, you can follow these steps:

  1. Download and execute the llamafile:

    • Download a llamafile.
    • Execute it with the following command to enable continuous batching and embeddings:
      llamafile -cb -np 4 -a my-llm-model --embedding
  2. Configure PaperQA2 to use the locally hosted LLM:

    • Use the following Python code to set up PaperQA2 with your local LLM server:

      from paperqa import Settings, ask
      
      # model_list follows LiteLLM's Router format: a list of model configurations.
      # The model_name must match the name passed to Settings(llm=..., summary_llm=...).
      local_llm_config = dict(
          model_list=[
              dict(
                  model_name="my-llm-model",
                  litellm_params=dict(
                      model="my-llm-model",
                      api_base="http://localhost:8080/v1",
                      api_key="sk-no-key-required",
                      temperature=0.1,
                      frequency_penalty=1.5,
                      max_tokens=512,
                  ),
              )
          ]
      )
      
      answer = ask(
          "What manufacturing challenges are unique to bispecific antibodies?",
          settings=Settings(
              llm="my-llm-model",
              llm_config=local_llm_config,
              summary_llm="my-llm-model",
              summary_llm_config=local_llm_config,
          ),
      )

This setup will configure PaperQA2 to use your locally hosted LLM server for both the main LLM tasks and the summary tasks [3].
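
If you also want embeddings to stay local (the llamafile above is started with --embedding, while PaperQA2 otherwise defaults to an OpenAI embedding model), you can change the embedding setting as well. This is only a sketch: the "st-" prefixed name below selects a locally run sentence-transformers model and is an example value, so check the README for the supported embedding options and any extra dependencies:

    from paperqa import Settings, ask

    # Reuses local_llm_config from the example above; embedding is switched to a
    # locally run sentence-transformers model ("st-" prefix) so no OpenAI key is
    # needed. The exact embedding model name here is illustrative.
    answer = ask(
        "What manufacturing challenges are unique to bispecific antibodies?",
        settings=Settings(
            llm="my-llm-model",
            llm_config=local_llm_config,
            summary_llm="my-llm-model",
            summary_llm_config=local_llm_config,
            embedding="st-multi-qa-MiniLM-L6-cos-v1",
        ),
    )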

To continue talking to Dosu, mention @dosu.