meta-llama / llama

Inference code for Llama models

Finetune LLaMA 2 for large tables having 99 columns #470

Open avineet123 opened 11 months ago

avineet123 commented 11 months ago

I am trying to finetune on a large table with 99 columns and 180 rows for complex SQL queries. I am unable to finetune because the table alone is about 6000 tokens. Can we do that using Llama 2? Please assist.

andnig commented 10 months ago

In general, sending large tables to LLMs is not feasible due to the context window. However, a strategy you can and should use is to put your columns into a vector store, together with a short description for each column (you store embeddings of those descriptions). At query time you first retrieve the potentially relevant columns and send only that subset of the huge table to the LLM. A minimal sketch of that retrieval step is shown below.
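Roughly something like this (a minimal sketch, assuming `sentence-transformers` is installed; the model name, column names, and descriptions are just placeholders):

```python
# Sketch of column retrieval: embed per-column descriptions, then pick the
# columns most similar to the user's question before building the prompt.
import numpy as np
from sentence_transformers import SentenceTransformer

# One short natural-language description per column of the big table (illustrative).
column_descriptions = {
    "cust_id": "unique customer identifier",
    "order_ts": "timestamp of the order",
    "total_amt": "total order amount in USD",
    # ... remaining columns ...
}

model = SentenceTransformer("all-MiniLM-L6-v2")  # any embedding model works
names = list(column_descriptions.keys())
col_embeddings = model.encode(list(column_descriptions.values()), normalize_embeddings=True)

def relevant_columns(question: str, top_k: int = 10) -> list[str]:
    """Return the top_k column names most similar to the question."""
    q = model.encode([question], normalize_embeddings=True)[0]
    scores = col_embeddings @ q  # cosine similarity, since embeddings are normalized
    return [names[i] for i in np.argsort(scores)[::-1][:top_k]]

# Only the retrieved subset of the schema goes into the prompt.
subset = relevant_columns("total revenue per customer last month")
prompt_schema = ", ".join(f"{c}: {column_descriptions[c]}" for c in subset)
```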

This also carries over to fine-tuning: you don't train the LLM on the full table, but only on the retrieved subset.
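For the fine-tuning side, one training record might then look roughly like this (table name, columns, and SQL are purely illustrative):

```python
# Hypothetical shape of one fine-tuning record: the prompt contains only the
# retrieved column subset, never the full 99-column schema.
record = {
    "prompt": (
        "Table orders(cust_id: unique customer identifier, "
        "order_ts: timestamp of the order, total_amt: total order amount in USD)\n"
        "Question: total revenue per customer last month\nSQL:"
    ),
    "completion": (
        "SELECT cust_id, SUM(total_amt) FROM orders "
        "WHERE order_ts >= date_trunc('month', now()) - interval '1 month' "
        "GROUP BY cust_id;"
    ),
}
```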

While this is somewhat cumbersome, there is currently no other way, unfortunately.