matdombrock / sllm

A command line interface for OpenAI LLMs.
https://www.npmjs.com/package/sllm
GNU General Public License v3.0

Please add llama-index functionality to expand the possible sizes of files. #3

Open ctrain79 opened 1 year ago

ctrain79 commented 1 year ago

This seems like an awesome start to command-line LLM integration. Please add llama-index functionality; it would make the project much more useful by handling files larger than the 4K-token limit (or whatever limits newer GPT versions have).

matdombrock commented 1 year ago

@ctrain79 Can you give me an example of the indexing functionality? I'm not sure if I'm familiar with that.

PRs are also welcome.

ctrain79 commented 1 year ago

It's not too bad with their docs:

https://gpt-index.readthedocs.io/en/latest/index.html

One of the simplest index types to start with is the vector index:

You will need your OpenAI access key set in the OPENAI_API_KEY environment variable.

from llama_index import GPTSimpleVectorIndex, SimpleDirectoryReader, LLMPredictor
from langchain import OpenAI

# Use gpt-3.5-turbo at temperature 0 for answering queries against the index
llm_predictor = LLMPredictor(llm=OpenAI(temperature=0, model_name="gpt-3.5-turbo"))

# Load every file in the ./data directory and build the vector index from it
documents = SimpleDirectoryReader('data').load_data()
index = GPTSimpleVectorIndex(documents, llm_predictor=llm_predictor)

# Persist the index so it can be reused without re-embedding
index.save_to_disk('index.json')

Then you can use the index again later (or just use it in the same script):

from llama_index import GPTSimpleVectorIndex

# Reload the saved index and query it without rebuilding
index = GPTSimpleVectorIndex.load_from_disk('index.json')

response = index.query("Hello, ChatGPT!")
print(response)

ctrain79 commented 1 year ago

Just install llama-index with pip (pip install llama-index). There are a couple of other Python packages you may need, depending on which document file formats you want to load; the errors will tell you which ones when you need them.

But I think the new OpenAI plugins are going to do similar things: give the model access to your documents. I kind of like that I could keep the index on my own computer, though.

matdombrock commented 1 year ago

@ctrain79 Is this something you would be interested in helping with or making a PR for?

Ideally this project does not take on any Python dependencies.

Maybe we can find a way to make this an optional feature, where a flag on the command would spawn a Python process and generate the index.
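
A minimal sketch of that idea, assuming a hypothetical build-index.py helper script that wraps the llama-index calls above (names and flag are illustrative, not a decided design):

import { spawn } from 'node:child_process';

// Hypothetical: invoked when the user passes an indexing flag.
// Spawns a Python helper that builds the llama-index index on disk.
function buildIndex(dataDir: string): Promise<void> {
  return new Promise((resolve, reject) => {
    const proc = spawn('python3', ['build-index.py', dataDir], {
      stdio: 'inherit', // stream the helper's output straight to the terminal
    });
    proc.on('error', reject); // e.g. python3 is not installed
    proc.on('close', (code) =>
      code === 0 ? resolve() : reject(new Error(`build-index.py exited with code ${code}`))
    );
  });
}

That way Python stays a purely optional dependency, only needed when the flag is actually used.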

ctrain79 commented 1 year ago

Spawning Python and using llama-index would be the fastest way to get it set up, and then if you want to keep everything TypeScript I could implement some of it myself. That would take me a while, though, because there are many choices: lots of different data structures and ways to perform the search are possible. I've only done light reading on it so far, but I will be reading more on the various measures used for search. The llama-index docs recommend people start with their simplest index, the vector index, so I would likely start there (see the sketch below).
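
For reference, a vector index at its simplest is just embeddings plus nearest-neighbor search. A rough TypeScript sketch of the idea (the embed callback is a stand-in for a real embeddings API such as OpenAI's; a real implementation would add chunking, persistence, and approximate search):

// A naive in-memory vector index: store (text, embedding) pairs and
// answer queries by cosine similarity over the stored vectors.
type Entry = { text: string; vec: number[] };

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

class NaiveVectorIndex {
  private entries: Entry[] = [];

  constructor(private embed: (text: string) => Promise<number[]>) {}

  async add(text: string): Promise<void> {
    this.entries.push({ text, vec: await this.embed(text) });
  }

  // Return the k chunks most similar to the query; only these
  // (not the whole document) are sent to the LLM as context.
  async query(q: string, k = 3): Promise<string[]> {
    const qv = await this.embed(q);
    return this.entries
      .map((e) => ({ text: e.text, score: cosine(qv, e.vec) }))
      .sort((x, y) => y.score - x.score)
      .slice(0, k)
      .map((e) => e.text);
  }
}

This is where the trade-offs below come in: a linear scan like this is exact but scales poorly, and smarter structures trade some recall for speed.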

The thing to worry about with indices is query cost: the point of the optimizations is to get a similar result for less expense. So the choice of data structure and search algorithm involves trade-offs between cost and accuracy measures like recall or similarity.

This might not be necessary with newer models, say with ChatGPT's larger prompts. I believe OpenAI is planning to release the ability to prompt with about 1 million tokens. But even then, an index would provide a cheaper alternative.

matdombrock commented 1 year ago

@ctrain79 This is a really cool idea and it would be super useful. I just don't have the time to dig into something like this right now myself.

That being said, I would be curious to see a Python solution like you mentioned working, and I would be willing to accept any PRs that make Python an optional dependency (to get this feature) or any PRs which use TypeScript.

> This might not be necessary with newer models, say with ChatGPT's larger prompts. I believe OpenAI is planning to release the ability to prompt with about 1 million tokens. But even then, an index would provide a cheaper alternative.

From what I understand, an index like this will always be faster and cheaper than sending a full document. So even if we get much larger contexts to play with, I think this would still be really nice to have.
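
For a rough, illustrative sense of the numbers: answering one question against a 100,000-token document by stuffing it all into the prompt costs about 100,000 input tokens per query, while retrieving, say, the top 3 chunks of 500 tokens each from an index costs about 1,500 input tokens per query, plus a one-time embedding pass over the document (and embedding models are typically much cheaper per token than completion models).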