Future-House / paper-qa

High accuracy RAG for answering questions from scientific documents with citations

`litellm_get_search_query` does not use a local llm config #568

Open migalkin opened 4 hours ago

migalkin commented 4 hours ago

In the standard `ask` pipeline, the function `litellm_get_search_query` does not use the custom LLM settings available in `query.settings.llm_config`:

https://github.com/Future-House/paper-qa/blob/e49f1af67e5463ee5188cbe20bdb9747f14db1f9/paperqa/agents/main.py#L178-L180
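That call site (paraphrased below; see the permalink for the exact code) passes only the question, the model name, and a count, so the settings' `llm_config` never reaches the helper:

```python
# Paraphrased call in paperqa/agents/main.py: note that no config is passed along.
for search in await litellm_get_search_query(question, llm=query.settings.llm, count=3):
    ...
```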

This leads to the creation of a default `LiteLLMModel` instance, which calls a public endpoint (e.g., OpenAI's) by default:

https://github.com/Future-House/paper-qa/blob/e49f1af67e5463ee5188cbe20bdb9747f14db1f9/paperqa/agents/helpers.py#L54-L55
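Those lines (again paraphrased; the surrounding body is elided) build the model from the name alone, so any user-supplied config is ignored:

```python
# Paraphrased from paperqa/agents/helpers.py: the model is constructed
# without a config, so LiteLLM routes to its default (public) endpoint.
model = LiteLLMModel(name=llm)
```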

I found this behavior after setting `llm_config`, `summary_llm_config`, and `agent_llm_config` to point to a custom (local) vLLM endpoint, but PaperQA still calls a hardcoded public endpoint at this step and fails because that endpoint doesn't recognize custom model names.
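For context, my settings follow the local-LLM pattern from the README; the model name, endpoint, and question below are placeholders:

```python
from paperqa import Settings, ask

# Placeholder config for an OpenAI-compatible local vLLM server.
local_llm_config = {
    "model_list": [
        {
            "model_name": "my-local-model",
            "litellm_params": {
                "model": "openai/my-local-model",
                "api_base": "http://localhost:8000/v1",
                "api_key": "EMPTY",
            },
        }
    ]
}

answer = ask(
    "What manufacturing challenges are unique to bispecific antibodies?",
    settings=Settings(
        llm="my-local-model",
        llm_config=local_llm_config,
        summary_llm="my-local-model",
        summary_llm_config=local_llm_config,
    ),
)
```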

I managed to work around this by patching `litellm_get_search_query` to accept a config dict and pass it through to the `LiteLLMModel` initialization:

```python
for search in await litellm_get_search_query(
    question, llm=query.settings.llm, count=3, config=query.settings.llm_config
):
```

and

```python
model = LiteLLMModel(name=llm, config=config)
```
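Concretely, the patched helper would just thread the new keyword through; a minimal sketch (the real signature in `helpers.py` has additional parameters that are elided here):

```python
# Sketch of the patched helper in paperqa/agents/helpers.py; LiteLLMModel is
# already imported there. Parameter order and defaults are simplified.
async def litellm_get_search_query(
    question: str,
    count: int,
    llm: str,
    config: dict | None = None,  # new: receives query.settings.llm_config
) -> list[str]:
    # Use the caller's config when provided; otherwise keep the old default behavior.
    if config is not None:
        model = LiteLLMModel(name=llm, config=config)
    else:
        model = LiteLLMModel(name=llm)
    ...  # prompt construction and response parsing unchanged
```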

Is this a bug (I can submit a PR if you'd like) or a feature of the fake agent? šŸ¤”

dosubot[bot] commented 4 hours ago

Hey @migalkin! I'm here to help you with any bugs, questions, or contributions. Let's tackle this together!

The issue you've identified with `litellm_get_search_query` not utilizing custom LLM settings from `query.settings.llm_config` does seem like a bug. The function defaults to a public endpoint, which can cause problems when custom model names are used. Your proposed patch to allow the function to accept a configuration dictionary and initialize `LiteLLMModel` with it is a valid solution. Submitting a PR to address this would likely be beneficial for users who need custom configurations.

To continue talking to Dosu, mention @dosu.

jamesbraza commented 2 hours ago

Hi @migalkin, thanks for the issue and the nice spot; I follow what you're saying. Feel free to open a PR for this šŸ‘

Looks like the same thing happens in `AnswerResponse.get_summary`.

Let's call this a bug, since we advertise paper-qa as working with local LLMs.