lxe / simple-llm-finetuner

Simple UI for LLM Model Finetuning
MIT License
2.05k stars 132 forks

Performance after FineTuning #47

Open Datta0 opened 1 year ago

Datta0 commented 1 year ago

I have fine-tuned LLaMA using this repo and a few text documents I had with me. If I provide 3-4 consecutive words from the input text, it amazingly completes the next couple of sentences. But if I ask for the same information as a question, or reorder the input prompt, it hallucinates.

I thought I was overfitting, so I increased the input data size and decreased the number of epochs, but then the model neither completed the sentences (given input as above) nor answered the questions.

I also tried vector-embedding search with a model on top of it to put things together, but that approach misses information spread across a few sentences. It also can't answer anything beyond simple "what"/"where" questions if the answer is expected to span multiple sentences, and it's even worse when it has to combine that information with general knowledge to infer something. So that doesn't seem like a very fruitful approach.

My goal is to get LLaMA to have knowledge of a few text documents I have locally. Can someone help me, please?

lxe commented 1 year ago

I think finetuning on a small sample of docs is not the best way to have the model gain knowledge. Embeddings seem like a better approach (along with finetuning on samples representing a specific task, such as summarization).

21iridescent commented 1 year ago

> I think finetuning on a small sample of docs is not the best way to have the model gain knowledge. Embeddings seem like a better approach (along with finetuning on samples representing a specific task, such as summarization).

May I ask how to use embeddings? Do you mean searching for the relevant docs by embedding similarity and concatenating them into the prompt?
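That is the usual pattern. A minimal sketch of the idea, using a bag-of-words cosine similarity as a stand-in for a real embedding model (in practice you would call something like sentence-transformers here); the documents and prompt template are made up for illustration:

```python
import math
from collections import Counter

# Toy corpus standing in for the local text documents.
documents = [
    "LLaMA is a family of large language models released by Meta.",
    "LoRA fine-tuning trains small adapter matrices instead of full weights.",
    "Vector search retrieves documents whose embeddings are close to the query.",
]

def embed(text: str) -> Counter:
    # Placeholder "embedding": a bag-of-words count vector.
    # A real pipeline would call an embedding model here instead.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    num = sum(a[t] * b[t] for t in a.keys() & b.keys())
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def build_prompt(question: str, top_k: int = 1) -> str:
    # Rank documents by similarity to the question, then
    # concatenate the best ones into the prompt as context.
    q = embed(question)
    ranked = sorted(range(len(documents)),
                    key=lambda i: cosine(q, embed(documents[i])),
                    reverse=True)
    context = "\n".join(documents[i] for i in ranked[:top_k])
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

print(build_prompt("What does LoRA fine-tuning train?"))
```

The retrieved context is prepended to the question, and the combined prompt goes to the unmodified base model; no fine-tuning is needed for the model to "see" the documents.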

abhishekrai43 commented 1 year ago

> I have fine-tuned LLaMA using this repo and a few text documents I had with me. If I provide 3-4 consecutive words from the input text, it amazingly completes the next couple of sentences. But if I ask for the same information as a question, or reorder the input prompt, it hallucinates.
>
> I thought I was overfitting, so I increased the input data size and decreased the number of epochs, but then the model neither completed the sentences (given input as above) nor answered the questions.
>
> I also tried vector-embedding search with a model on top of it to put things together, but that approach misses information spread across a few sentences. It also can't answer anything beyond simple "what"/"where" questions if the answer is expected to span multiple sentences, and it's even worse when it has to combine that information with general knowledge to infer something. So that doesn't seem like a very fruitful approach.
>
> My goal is to get LLaMA to have knowledge of a few text documents I have locally. Can someone help me, please?

Use RAG. This whole fine-tuning business, as far as I can tell, has very few actual use cases. All these videos and tutorials finish by asking one or two questions from the dataset; I'm not even sure these methods actually produce a useful model. As far as question answering over docs is concerned, RAG is simple and easy.
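One detail worth adding for the "answer spans multiple sentences" problem described above: RAG pipelines usually split documents into overlapping chunks before embedding them, so that related sentences land in the same retrieved chunk. A minimal sketch (the sizes here are illustrative, not tuned):

```python
def chunk(text: str, size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into windows of `size` words, each sharing `overlap`
    words with the previous window, so answers spanning a boundary
    still appear whole in at least one chunk."""
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size])
            for i in range(0, max(len(words) - overlap, 1), step)]

# A 120-word stand-in document yields three overlapping chunks.
doc = " ".join(f"word{i}" for i in range(120))
chunks = chunk(doc)
print(len(chunks))  # 3
```

Each chunk is then embedded and indexed; at query time the top-scoring chunks, rather than whole documents, are concatenated into the prompt.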