Snikch63200 opened this issue 1 month ago
The low relevance of your Doc().query results is likely due to the use of the SparseEmbeddingModel. This model is a simple keyword-search model, which may not capture the semantic meaning of the text effectively, leading to off-topic citations and less relevant answers [1].
To improve relevance, consider using a HybridEmbeddingModel, which combines both sparse and dense embeddings. This approach can enhance semantic understanding and improve the quality of the results [2].
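For illustration, a minimal sketch of such a hybrid setup could look like the following, assuming HybridEmbeddingModel, LiteLLMEmbeddingModel, and SparseEmbeddingModel are importable from paperqa.llms, and using "text-embedding-3-small" purely as a placeholder dense model name:
from paperqa.llms import HybridEmbeddingModel, LiteLLMEmbeddingModel, SparseEmbeddingModel

# Sketch only: pair a dense, LiteLLM-backed embedding model with the sparse keyword model.
# Replace "text-embedding-3-small" with whatever embedding endpoint your LiteLLM setup exposes.
dense = LiteLLMEmbeddingModel(name="text-embedding-3-small")
sparse = SparseEmbeddingModel(ndim=256)
embedding_model = HybridEmbeddingModel(models=[dense, sparse])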
Additionally, your LiteLLM configuration with a low temperature (0.1) and a high frequency penalty (1.5) might also impact the diversity and quality of the generated answers. Adjusting these parameters could help in fine-tuning the response quality.
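To show where those knobs live, here is a hedged sketch of a local LiteLLM router configuration in the model_list / litellm_params style; the model name, api_base, and api_key are placeholders, and the values simply mirror the ones mentioned in this thread:
# Hypothetical local LiteLLM router config; model name, api_base, and api_key are placeholders.
local_llm_config = dict(
    model_list=[
        dict(
            model_name="mixtral-8x7b-instruct-v0.1.Q4_K_M",
            litellm_params=dict(
                model="openai/mixtral-8x7b-instruct-v0.1.Q4_K_M",
                api_base="http://localhost:8080/v1",
                api_key="sk-no-key-required",
                temperature=0.1,        # the temperature discussed above
                frequency_penalty=1.5,  # the frequency penalty discussed above
            ),
        )
    ]
)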
I use 'mixtral-8x7b-instruct-v0.1.Q4_K_M' as my local LLM model. Can I use this model as an embedding model, or are embedding models specific and have to be run separately?
What are the recommended values for temperature and frequency penalty to increase 'mixtral-8x7b-instruct-v0.1.Q4_K_M' performance?
@Dosu.
I couldn't find specific information on using 'mixtral-8x7b-instruct-v0.1.Q4_K_M' as an embedding model in the PaperQA framework. Typically, embedding models are specialized and may need to be run separately from LLMs.
Regarding the recommended values for temperature and frequency penalty for 'mixtral-8x7b-instruct-v0.1.Q4_K_M', I couldn't find specific recommendations. You might need to experiment with different settings to optimize performance, starting with the default values used in your current configuration.
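As a purely hypothetical starting point for that experimentation (not a recommendation), the two parameters can be adjusted in the litellm_params block sketched earlier and passed to PaperQA through Settings, for example:
from paperqa import Settings, ask

# Hypothetical starting values to iterate from; evaluate answer quality and adjust.
# local_llm_config is the router config sketched earlier in this thread.
local_llm_config["model_list"][0]["litellm_params"].update(
    temperature=0.3,
    frequency_penalty=0.5,
)

answer = ask(
    "Your question here",
    settings=Settings(
        llm="mixtral-8x7b-instruct-v0.1.Q4_K_M",
        llm_config=local_llm_config,
        summary_llm="mixtral-8x7b-instruct-v0.1.Q4_K_M",
        summary_llm_config=local_llm_config,
    ),
)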
How can I define a LiteLLM embedding model for a local embedding model? @Dosu.
To define a LiteLLM embedding model for a local embedding model, you can use the LiteLLMEmbeddingModel class. Here's an example:
from paperqa.llms import LiteLLMEmbeddingModel
from limits import RateLimitItemPerSecond
# Define the LiteLLM embedding model with rate limits
embedding_model = LiteLLMEmbeddingModel(
    name="your-local-model-name",
    config={"rate_limit": RateLimitItemPerSecond(20, 5)},
)
You can customize the configuration through the config dictionary, for example with a different rate limit:
embedding_model = LiteLLMEmbeddingModel(
    name="your-local-model-name",
    config={"rate_limit": RateLimitItemPerSecond(1_000_000, 1)},
)
Alternatively, you can use the from_name method:
embedding_model = LiteLLMEmbeddingModel.from_name("your-local-model-name")
This will create a LiteLLMEmbeddingModel with the specified name and configuration [1][2][3].
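To tie it together, here is a short, hedged sketch of how the resulting embedding model could be passed to Docs; the embedding_model keyword on aadd/aquery and the formatted_answer attribute are assumptions based on recent paperqa versions, and "my_paper.pdf" is a placeholder:
import asyncio
from paperqa import Docs

async def my_query():
    docs = Docs()
    # Assumed keyword: pass the embedding model explicitly when indexing and querying.
    await docs.aadd("my_paper.pdf", embedding_model=embedding_model)
    answer = await docs.aquery("Your question here", embedding_model=embedding_model)
    print(answer.formatted_answer)  # attribute name assumed from recent paperqa versions

asyncio.run(my_query())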
I defined my local embedding model as follows:
embedding_model = HybridEmbeddingModel([LiteLLMEmbeddingModel.from_name(f"openai/my-llm-model"),
SparseEmbeddingModel(ndim=4096)])
I got this error:
Traceback (most recent call last):
File "C:\Users\XXXXXXXXXXX\Desktop\PaperQA\paper_qa_question_async.py", line 85, in <module>
asyncio.run(my_query())
File "C:\ProgramData\anaconda3\envs\PaperQA_env\Lib\asyncio\runners.py", line 194, in run
return runner.run(main)
^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\envs\PaperQA_env\Lib\asyncio\runners.py", line 118, in run
return self._loop.run_until_complete(task)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\envs\PaperQA_env\Lib\asyncio\base_events.py", line 687, in run_until_complete
return future.result()
^^^^^^^^^^^^^^^
File "C:\Users\XXXXXXXXXXX\Desktop\PaperQA\paper_qa_question_async.py", line 66, in my_query
embedding_model = HybridEmbeddingModel([LiteLLMEmbeddingModel.from_name(f"openai/my-llm-model"),
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ProgramData\anaconda3\envs\PaperQA_env\Lib\site-packages\pydantic\_internal\_model_construction.py", line 262, in __getattr__
raise AttributeError(item)
AttributeError: from_name. Did you mean: 'from_orm'?
Any solution? @Dosu.
Hello,
I wonder why my Doc().query requests often produce random and poor-quality answers in terms of relevance.
Papers are sometimes relevant, sometimes not... Citations are, most of the time, completely off-topic even when the selected document contains relevant information.
I use this config for the LiteLLM model:
With these answer parameters:
I use a SparseEmbeddingModel as follows to run a fully local PaperQA instance:
embedding_model = SparseEmbeddingModel(ndim=4096)
Is it a parameter problem (like the number of tokens) or an embedding model problem? (I read in the code documentation that sparse embedding is a very simple approach.)
Best regards. @Dosu.