avineet123 opened 11 months ago

I am trying to fine-tune a model on large tables (99 columns, 180 rows) for complex SQL queries. I am unable to fine-tune because the serialized table alone comes to 6,000 tokens. Can we do that using Llama 2? Please assist.
In general, sending large tables to LLMs is not feasible because of the context window. A strategy you can and should use instead is a vector store: store each column together with a short description of it (i.e., store embeddings of those descriptions). Then, for a given question, first retrieve the potentially helpful columns and send only that subset of the huge table to the LLM.
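Here is a minimal sketch of that retrieval step, assuming the `sentence-transformers` package; the column names and descriptions are hypothetical placeholders standing in for your 99-column schema:

```python
# Minimal sketch: embed column descriptions once, retrieve the best matches
# per question. Column names/descriptions below are hypothetical.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

columns = {
    "cust_id": "unique identifier of the customer",
    "order_total": "total value of the order in USD",
    "order_date": "date on which the order was placed",
    # ...descriptions for the remaining columns of the 99-column table
}

names = list(columns)
col_emb = model.encode([f"{n}: {d}" for n, d in columns.items()])

def relevant_columns(question: str, top_k: int = 10) -> list[str]:
    """Return the top_k columns whose descriptions best match the question."""
    q_emb = model.encode(question)
    scores = util.cos_sim(q_emb, col_emb)[0]      # cosine similarity per column
    top = scores.argsort(descending=True)[:top_k]
    return [names[i] for i in top]

# Only these columns (plus a few sample rows) go into the LLM prompt.
print(relevant_columns("total order value per customer"))
```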
This also carries over to training: you don't train the LLM on the full table, but only on the retrieved subset.
While this is somewhat cumbersome, there is currently no other way, unfortunately.
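To make the training side concrete, here is a hedged sketch of how one fine-tuning record could be built from that subset. `relevant_columns` and `columns` come from the sketch above; the `orders` table name and the prompt/completion format are illustrative assumptions, not anything Llama 2 prescribes:

```python
# Illustrative only: one fine-tuning record containing just the retrieved
# column subset instead of the full 99-column schema.
import json

def make_training_record(question: str, gold_sql: str) -> str:
    """Serialize one training example that includes only the relevant columns."""
    cols = relevant_columns(question, top_k=10)
    schema = ", ".join(f"{c} ({columns[c]})" for c in cols)
    prompt = (
        "Given the table `orders` with columns:\n"
        f"{schema}\n"
        f"Write a SQL query for: {question}"
    )
    # One JSON-lines record in a generic prompt/completion format.
    return json.dumps({"prompt": prompt, "completion": gold_sql})

print(make_training_record(
    "total order value per customer",
    "SELECT cust_id, SUM(order_total) FROM orders GROUP BY cust_id;",
))
```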