Closed: logancyang closed this issue 7 months ago
I support your idea. Use an LLM to read the question, have it generate search keywords, then call Obsidian's full-text search, and finally run embedding search over the results. I think this is an effective approach.
I think it is necessary to support Obsidian's built-in search queries. Only then can the two systems, LLM and PKM, be connected.
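Roughly the flow I'm imagining, as a TypeScript sketch. The `generateKeywords` and `embed` helpers and the naive keyword scan are placeholders I made up, not actual plugin or Obsidian search APIs:

```ts
import { App, TFile } from "obsidian";

// Placeholders for the plugin's own LLM/embedding clients (not real APIs).
declare function generateKeywords(question: string): Promise<string[]>; // ask the LLM for search keywords
declare function embed(text: string): Promise<number[]>;                // embedding provider call

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) { dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i]; }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// 1) LLM -> keywords, 2) keyword match over the vault (a stand-in for
// Obsidian's full-text search), 3) embedding similarity to rank the survivors.
async function retrieve(app: App, question: string, topK = 5) {
  const keywords = await generateKeywords(question);
  const candidates: { file: TFile; text: string }[] = [];
  for (const file of app.vault.getMarkdownFiles()) {
    const text = await app.vault.cachedRead(file);
    if (keywords.some((k) => text.toLowerCase().includes(k.toLowerCase()))) {
      candidates.push({ file, text });
    }
  }
  const qVec = await embed(question);
  const scored = await Promise.all(
    candidates.map(async (c) => ({ ...c, score: cosine(qVec, await embed(c.text)) }))
  );
  return scored.sort((a, b) => b.score - a.score).slice(0, topK);
}
```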
I have an idea: instead of sending all the fragments directly to the LLM, let the LLM score each fragment by its relevance to the question, and then use only the highly relevant fragments to answer. That way we can down-weight irrelevant material while drawing on a larger corpus. I don't know whether this is already the workflow.
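Something like this, as a rough sketch; `callLLM` is a made-up helper, not an existing plugin API:

```ts
// `callLLM` is a hypothetical chat-completion helper, not an actual plugin API.
declare function callLLM(prompt: string): Promise<string>;

// Score each fragment 0-10 for relevance and keep only the high scorers.
async function keepRelevant(question: string, fragments: string[], minScore = 6): Promise<string[]> {
  const kept: string[] = [];
  for (const fragment of fragments) {
    const reply = await callLLM(
      `Rate how relevant the following note fragment is to the question ` +
      `on a scale of 0-10. Reply with a single number only.\n\n` +
      `Question: ${question}\n\nFragment:\n${fragment}`
    );
    const score = parseInt(reply.trim(), 10);
    // Keep the fragment only if the model returned a parseable, high score.
    if (!Number.isNaN(score) && score >= minScore) kept.push(fragment);
  }
  return kept;
}
```

Only the fragments that survive this filter would then go into the final answer prompt.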
This is actually a standard step in RAG called LLM reranking. It's a good idea, but it relies on predictable LLM behavior. If this plugin worked by calling my backend, where I set all the params for the users, there wouldn't be much of a problem. But this plugin is completely local with params set on the user side, which means someone may be running the reranking with a <3B local LLM that isn't fit for the task.
In short, although it's a good idea, I'd like to keep the moving parts minimal to avoid people using it the wrong way.
At present, retrieving question-related content from the vault is the key issue; at least Gemini Pro does not perform very well in this regard. Do you mean this standard step is not performed when using a local model?
@wwjCMP When you say Gemini Pro does not perform very well, do you mean that when irrelevant notes are retrieved, the Gemini Pro chat model gives low-quality answers?
If I can use a fixed reranker whose quality I can trust, reranking is definitely going to help a ton. I should probably look into Cohere's new offerings like Command R.
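Roughly what a hosted reranker call could look like. The endpoint, model name, and response fields below are from my reading of Cohere's docs and should be double-checked; this is a sketch, not what the plugin does today:

```ts
// Sketch only: verify the endpoint, model id, and response shape against
// Cohere's current documentation before relying on this.
async function cohereRerank(apiKey: string, query: string, documents: string[], topN = 5) {
  const res = await fetch("https://api.cohere.ai/v1/rerank", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "rerank-english-v3.0", // assumed model id; check Cohere's model list
      query,
      documents,
      top_n: topN,
    }),
  });
  const data = await res.json();
  // Expected shape (assumption): { results: [{ index, relevance_score }, ...] }
  return data.results.map((r: { index: number; relevance_score: number }) => ({
    text: documents[r.index],
    score: r.relevance_score,
  }));
}
```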
It mostly refuses to answer questions because it believes the provided content is irrelevant to the question.
I have found that when using an LLM to explore a vault, excluding irrelevant information is more effective than providing relevant information.
@wwjCMP Great observation.
In the near term I'm going to make the "similarity threshold" a user setting so you can tune it up to exclude irrelevant docs. Since the threshold can differ across embedding models, users will need to experiment a bit.
This is the downside of customizability: I don't have much control over how users want to use it, so it requires more know-how from them. If instead I provided a fixed setting and a fixed provider, I could control how it works much better.
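The setting itself boils down to something like this; the names here are illustrative, not the plugin's actual internals:

```ts
// Hypothetical retrieval hit; the plugin's real data structures will differ.
interface SimilarityHit {
  path: string;       // note path
  content: string;    // chunk text
  similarity: number; // cosine similarity in [0, 1] against the query embedding
}

// Drop anything below the user-configured threshold before building the prompt.
// The "right" value depends on the embedding model, hence the need to experiment.
// e.g. applyThreshold(hits, 0.75)
function applyThreshold(hits: SimilarityHit[], similarityThreshold: number): SimilarityHit[] {
  return hits.filter((hit) => hit.similarity >= similarityThreshold);
}
```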
https://github.com/logancyang/obsidian-copilot/releases/tag/2.5.2
Now we can specify a note in Vault QA through [[ ]]. I wonder if we can also exclude notes through [[ ]]?
Is it possible to use a local rerank model? I'm not sure whether Ollama currently supports rerank models.
Right now, if you ask the AI to do something with a [[note title]] directly in Vault QA, it does not work every time. This is because it uses embedding search. The solution is to parse the user message for the note title and do a full-text search.
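Roughly the idea, as a sketch against Obsidian's `metadataCache.getFirstLinkpathDest` and `vault.cachedRead`; not the final implementation:

```ts
import { App } from "obsidian";

// Pull every [[note title]] out of the user's message and read those notes
// directly, instead of hoping embedding search surfaces them.
async function readMentionedNotes(app: App, userMessage: string): Promise<Record<string, string>> {
  const notes: Record<string, string> = {};
  for (const match of userMessage.matchAll(/\[\[([^\]|]+)(?:\|[^\]]*)?\]\]/g)) {
    const title = match[1].trim();
    const file = app.metadataCache.getFirstLinkpathDest(title, "");
    if (file) {
      notes[file.path] = await app.vault.cachedRead(file);
    }
  }
  return notes;
}
```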
Next step: