IMO, if you need factual answers restricted to a known set, the only way to be sure is to use vector-embedding hits, and then use the generation prompt to produce a summary only from the given text, with temperature = 0.
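For concreteness, here is a minimal sketch of that second step, assuming the vector-search hits are already in hand (retrieval itself is sketched further down the thread). The model name, prompt wording, and function name are my own assumptions, not anything prescribed:

```python
# Minimal sketch: answer only from retrieved text, with temperature = 0.
# Assumes the relevant chunks were already found via embedding search.
from openai import OpenAI

client = OpenAI()

def answer_from_context(question: str, retrieved_chunks: list[str]) -> str:
    context = "\n\n".join(retrieved_chunks)
    prompt = (
        "Answer the question using ONLY the text below. "
        "If the answer is not in the text, say you don't know.\n\n"
        f"Text:\n{context}\n\nQuestion: {question}"
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # illustrative; any chat model works here
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # deterministic output, no creative drift
    )
    return response.choices[0].message.content
```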
So I've been doing a ton of research around this Q&A use case.
From what I understand, to make it work you'll need to generate a ton of training pairs in the form of questions and answers on your data. You can do it manually, or feed portions of text into an LLM (e.g., ChatGPT) and generate a whole bunch of them automatically.
I do feel like that ends up defeating the purpose of fine-tuning an LLM in the first place.
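A hypothetical sketch of that pair-generation step, for illustration: feed each chunk of your data to a chat model, ask for question/answer pairs, and write them out in OpenAI's chat fine-tuning JSONL format. All names and the prompt here are assumptions, not a prescribed recipe:

```python
# Generate Q&A training pairs from document chunks with an LLM,
# then write them in OpenAI's chat fine-tuning JSONL format.
import json
from openai import OpenAI

client = OpenAI()

def generate_pairs(chunk: str, n: int = 3) -> list[dict]:
    prompt = (
        f"Write {n} question/answer pairs about the text below as a JSON list "
        'of objects with "question" and "answer" keys.\n\n' + chunk
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    # Sketch only: real code should handle malformed JSON from the model.
    return json.loads(response.choices[0].message.content)

my_document_chunks = ["...your source text, split into chunks..."]  # assumed

with open("training.jsonl", "w") as f:
    for chunk in my_document_chunks:
        for pair in generate_pairs(chunk):
            record = {"messages": [
                {"role": "user", "content": pair["question"]},
                {"role": "assistant", "content": pair["answer"]},
            ]}
            f.write(json.dumps(record) + "\n")
```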
I believe most tools that do this work like this:
I know that in the case of OpenAI fine-tuning, it doesn't work by providing my own data so that the model can then draw on it. Rather, it works by teaching the model what style of language to use. So if I want GPT to use my data, I have to compute embeddings, store them in a vector database, and then put the relevant chunk of data back into the GPT prompt.
Is it similar here?
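For reference, a bare-bones version of the embed-and-retrieve step described above, with no real vector database, just cosine similarity over in-memory embeddings. The chunk contents and model choice are assumptions for illustration:

```python
# Embed document chunks once, then retrieve the best match for a question
# by cosine similarity, so it can be pasted back into the prompt.
import numpy as np
from openai import OpenAI

client = OpenAI()

def embed(text: str) -> np.ndarray:
    response = client.embeddings.create(
        model="text-embedding-ada-002",
        input=text,
    )
    return np.array(response.data[0].embedding)

chunks = ["...your document chunks..."]  # assumed pre-split
chunk_vectors = np.array([embed(c) for c in chunks])

def top_chunk(question: str) -> str:
    q = embed(question)
    # ada-002 vectors are unit-length, so a dot product is cosine similarity
    scores = chunk_vectors @ q
    return chunks[int(np.argmax(scores))]
```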