Open ChettakattuA opened 9 months ago
The prompt template is indeed key in this case but it all depends on the LLM that you choose. Some are better than others at following instructions. For instance, GPT-4 tends to follow instructions much better than GPT-3.5. Moreover, you can use KeyBERT together with KeyLLM. There, the candidate keywords will be passed to KeyLLM and the LLM can still decide by itself which to keep and which to throw away.
So how exactly does KeyBERT + KeyLLM work? Does it just prompt the LLM to pick keywords from a list, or does it use some weighting system to find the answer, unlike the usual Q&A we do in ChatGPT? Also, if we first run KeyBERT and get, say, 5 relevant keywords, do we simply pass these keywords as candidates to KeyLLM? How exactly does the combination work?
You can find an extensive description on how it works here: https://maartengr.github.io/KeyBERT/guides/keyllm.html
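As a rough illustration of the "candidates are passed to the LLM" idea, here is a minimal, library-free sketch. It assumes the candidates are substituted into the prompt as plain text through a `[CANDIDATES]` tag, next to the `[DOCUMENT]` tag used in this thread. The helper names `build_prompt` and `parse_keywords` are hypothetical and are not part of KeyBERT. No weighting is involved in this sketch: the LLM just sees the document and the list, and decides which candidates to keep.

```python
# Sketch only: mimics how candidate keywords could be placed into a prompt.
# The [CANDIDATES] tag and both helper names are assumptions for illustration.
PROMPT = """
I have the following document:
[DOCUMENT]

With the following candidate keywords:
[CANDIDATES]

Based on the information above, keep only the candidates that best describe
the document. Return them as a comma-separated list.
"""


def build_prompt(document, candidates, template=PROMPT):
    """Fill the template with the document and the candidate keywords."""
    return (
        template
        .replace("[DOCUMENT]", document)
        .replace("[CANDIDATES]", ", ".join(candidates))
    )


def parse_keywords(response):
    """Split a comma-separated LLM response into clean keywords."""
    return [kw.strip() for kw in response.split(",") if kw.strip()]


# Made-up sample input: five candidates as KeyBERT might return them.
document = "KeyLLM extracts keywords from text with a large language model."
candidates = ["keywords", "language model", "text", "extracts", "large"]
prompt = build_prompt(document, candidates)
```

The resulting `prompt` string is what would be sent to the LLM, and `parse_keywords` turns its comma-separated reply back into a list.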
I am using KeyLLM for a keyword extraction tool. I have already implemented the tool with KeyBERT, and since an LLM seems like a cool approach, I would like to switch to KeyLLM.
But KeyLLM seems to expose too few parameters. For example, KeyBERT has parameters such as stop_words, diversity, and top_n, which give us great control over the keywords we get, but KeyLLM lacks these parameters.
Even when we state these constraints in the prompt, the attempt fails in many ways.
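One possible workaround, sketched here rather than taken from KeyBERT, is to enforce these constraints in plain Python after the LLM responds instead of relying on the prompt alone. The function `filter_keywords` and its parameters `banned` and `top_n` are hypothetical names, and the sample text is made up. The document check is a simple case-insensitive substring match, which is an assumption about what "extract terms only from the document" should mean.

```python
def filter_keywords(keywords, document, banned=(), top_n=10):
    """Keep unique keywords that occur in the document and are not banned."""
    doc_lower = document.lower()
    banned_lower = {b.lower() for b in banned}
    seen = set()
    kept = []
    for kw in keywords:
        key = kw.strip().lower()
        # Skip empty strings, case-insensitive duplicates, and banned terms.
        if not key or key in seen or key in banned_lower:
            continue
        # Enforce "terms only from the document" via a substring check.
        if key not in doc_lower:
            continue
        seen.add(key)
        kept.append(kw.strip())
    # Truncate to at most top_n keywords, preserving the LLM's order.
    return kept[:top_n]


# Made-up sample text and raw LLM output, for illustration only.
document = "RESIN Grant Agreement: the report covers climate adaptation in cities."
raw = ["RESIN", "climate adaptation", "cities", "Cities",
       "urban resilience", "Grant Agreement"]
result = filter_keywords(raw, document, banned=["RESIN", "Grant Agreement"])
```

Here `result` keeps only "climate adaptation" and "cities": the banned terms, the duplicate, and the term missing from the text are dropped.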
import openai
from keybert.llm import OpenAI
from keybert import KeyLLM

# Create your LLM
openai.api_key = "sk-............."
prompt = """
I have the following document:
[DOCUMENT]
Display 10 terms that best describe the document. Extract terms only from document. Avoid the keywords in following list: RESIN, Grant Agreement