@edublancas quick update: I have a skeleton API running using FastAPI and Celery for the task queue. Working on porting over the functionality from the original chat-with-github example.
Some clarification: the original app defines an `IndexLoader`, which downloads and parses a repo into a `VectorStoreIndex`, one repo at a time. We want the new API to download/parse multiple repos at once in the background, correct? Then it can load the relevant repo and answer the question whenever a `POST /ask/` is made.
My approach is (a rough sketch of these endpoints follows below):

- A `.metadata.json` file and an `/indexes` folder. `.metadata.json` will store the `id`, `status`, and `path` for each repo. `/indexes` will contain an individual `.pickle` file for each repo's content.
- `POST /scrape/` will create a new entry in `.metadata.json`, start the `download` task asynchronously, and return the id. The `download` task gets the repo contents, creates and saves an index file from it, then updates the `status` and `path` in `.metadata.json`.
- `GET /status/{repo_id}` just returns the `status` from `.metadata.json`.
- `POST /ask/` looks up the `path` from `.metadata.json` and passes the `.pickle` file to the LLM to answer the question.

Let me know how this sounds.
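Roughly, the endpoints could look like this (a sketch only, not final code: the `tasks.download` Celery task is assumed to exist and is sketched later in the thread):

```python
import json
import uuid
from pathlib import Path

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

# hypothetical module holding the Celery `download` task
from tasks import download

METADATA = Path(".metadata.json")
app = FastAPI()


class ScrapeRequest(BaseModel):
    repo: str  # e.g. "ploomber/jupysql"


class AskRequest(BaseModel):
    repo_id: str
    question: str


def read_metadata() -> dict:
    return json.loads(METADATA.read_text()) if METADATA.exists() else {}


@app.post("/scrape/")
def scrape(req: ScrapeRequest):
    repo_id = str(uuid.uuid4())
    meta = read_metadata()
    meta[repo_id] = {"repo": req.repo, "status": "pending", "path": None}
    METADATA.write_text(json.dumps(meta))
    download.delay(repo_id, req.repo)  # runs in the background via Celery
    return {"repo_id": repo_id}


@app.get("/status/{repo_id}")
def status(repo_id: str):
    meta = read_metadata()
    if repo_id not in meta:
        raise HTTPException(status_code=404, detail="unknown repo_id")
    return {"status": meta[repo_id]["status"]}


@app.post("/ask/")
def ask(req: AskRequest):
    entry = read_metadata().get(req.repo_id)
    if entry is None or entry["status"] != "finished":
        raise HTTPException(status_code=409, detail="index not ready")
    # load the pickled index from entry["path"] and query it with req.question
    ...
```

One caveat with this shape: concurrent writes to the JSON file from the API process and the Celery worker would need care.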
overall your approach sounds good, just one suggestion: instead of the `.metadata.json` file, track each repo's `id`, `status`, and `path` in SQLite
and what do you want to store in the pickle file?
@edublancas okay that makes sense, I'll try out the sqlite method.
The pickle file stores the `VectorStoreIndex` that is built from a repo, so there will be an `index_repo-id.pickle` file for each repo that is parsed. This way we can just load the `VectorStoreIndex` and use it to answer questions, which is faster than creating a new one every time a user asks a question. btw I didn't decide on this, it's just how they did it in the original chat-with-github example.
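For illustration, the persistence pattern is roughly this (a minimal sketch; the file layout follows the `/indexes` plan above):

```python
import pickle
from pathlib import Path

INDEXES = Path("indexes")


def save_index(index, repo_id: str) -> str:
    """Persist a built VectorStoreIndex so it can be reused for later questions."""
    INDEXES.mkdir(exist_ok=True)
    path = INDEXES / f"index_{repo_id}.pickle"
    with path.open("wb") as f:
        pickle.dump(index, f)
    return str(path)


def load_index(path: str):
    """Reload a saved index instead of re-parsing the repo on every question."""
    with open(path, "rb") as f:
        return pickle.load(f)
```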
try swapping the `VectorStoreIndex` persistence to LanceDB, it'll allow you to persist the index using their format instead of pickle: https://docs.llamaindex.ai/en/stable/examples/vector_stores/LanceDBIndexDemo.html
if it doesn't work, pickle is ok
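For reference, the linked demo boils down to something like this (a sketch assuming the `llama_index` import paths from that page; newer releases have since moved these modules around):

```python
from llama_index import SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.vector_stores import LanceDBVectorStore

# "repos/jupysql" is a placeholder path to an already-cloned repo
documents = SimpleDirectoryReader("repos/jupysql").load_data()

# LanceDB persists the index to disk in its own format, so no pickle file is needed
vector_store = LanceDBVectorStore(uri="indexes/lancedb")
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

print(index.as_query_engine().query("What does this repo do?"))
```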
@bryannho let's now build a frontend. let's use Solara this time, I think you built the arxiv chat with solara right? you can use the same code for the chat
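For context, the Solara chat loop looks roughly like this (a sketch using `solara.lab`'s chat components; `ask_api` is a hypothetical helper that calls the `POST /ask/` endpoint):

```python
import requests
import solara
import solara.lab

messages = solara.reactive([])  # each item: {"role": ..., "content": ...}


def ask_api(question: str) -> str:
    """Hypothetical helper that forwards the question to the FastAPI backend."""
    resp = requests.post(
        "http://localhost:8000/ask/",
        json={"repo_id": "...", "question": question},  # repo_id comes from the UI
    )
    return resp.json()["answer"]


@solara.component
def Page():
    def send(message):
        messages.value = [
            *messages.value,
            {"role": "user", "content": message},
            {"role": "assistant", "content": ask_api(message)},
        ]

    with solara.lab.ChatBox():
        for item in messages.value:
            with solara.lab.ChatMessage(user=item["role"] == "user", name=item["role"]):
                solara.Markdown(item["content"])
    solara.lab.ChatInput(send_callback=send)
```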
@edublancas question on the frontend design:
In the original Panel app, the user enters the owner, repo, and branch info via a form on the side panel. The app then loads the repo, and the user can use the chat interface on the main panel. For reference: (screenshot of the original Panel UI)

Should we replicate this design, or make it purely a chat interface as we did with Arxiv Chat?

If we use the Panel design, it makes the logic to load repos a little simpler. The original app only allows loading one repo at a time, but the new one will allow multiple. Either way, I'll need to use OpenAI function calling to discern which repo the user is asking about (rough sketch below); if we use a pure chat interface, I'll also need the LLM to decide whether the user is asking to load a new repo.
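For illustration, that routing step could look like this (a sketch using the current `openai` client; `select_repo` is a hypothetical tool, and the loaded repos are passed in the system message):

```python
import json

from openai import OpenAI

client = OpenAI()

tools = [
    {
        "type": "function",
        "function": {
            "name": "select_repo",
            "description": "Pick which loaded repo the user's question is about",
            "parameters": {
                "type": "object",
                "properties": {
                    "repo_id": {"type": "string", "description": "id of the target repo"}
                },
                "required": ["repo_id"],
            },
        },
    }
]


def route_question(question: str, loaded_repos: dict) -> str:
    """Ask the model which loaded repo the question refers to."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": f"Loaded repos: {json.dumps(loaded_repos)}"},
            {"role": "user", "content": question},
        ],
        tools=tools,
        # force the model to answer via the select_repo tool
        tool_choice={"type": "function", "function": {"name": "select_repo"}},
    )
    call = response.choices[0].message.tool_calls[0]
    return json.loads(call.function.arguments)["repo_id"]
```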
yeah let's keep the Panel design (user selects which repo to build), sounds like that's simpler
we want to build an application similar to this one but using FastAPI (we'll only build the API for now, we'll tackle the frontend later)

there are a few endpoints that this app needs:

- an endpoint to parse the contents of a repo: `POST /scrape/` should scrape the data from a GitHub repo (it should take `repo` in the request body, e.g. `ploomber/jupysql`) and return a `repo_id`
- `GET /status/{repo_id}` should return the status of a `repo_id` (`pending`, `finished`) so clients can check whether scraping has finished
- `POST /ask/` should take a `question` and a `repo_id` in the body, and return the answer to the question using that repository

since scraping will take a minute or so, we need to implement a task queue to run jobs in the background, I think celery is the simplest option (a minimal sketch of the wiring follows below)

important: everything has to be prepared in a single dockerfile
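For reference, a minimal sketch of the Celery wiring (assuming a Redis broker; `clone_repo` and `build_index` are hypothetical helpers for fetching the repo and parsing it into a `VectorStoreIndex`):

```python
import json
import pickle
from pathlib import Path

from celery import Celery

# assumes a Redis broker running alongside the API
celery_app = Celery("tasks", broker="redis://localhost:6379/0")

METADATA = Path(".metadata.json")
INDEXES = Path("indexes")


@celery_app.task
def download(repo_id: str, repo: str) -> None:
    """Fetch the repo, build its index, and mark it finished in the metadata."""
    # hypothetical helpers: clone_repo fetches the GitHub repo locally,
    # build_index parses the files into a VectorStoreIndex
    from indexing import build_index, clone_repo

    docs_path = clone_repo(repo)
    index = build_index(docs_path)

    INDEXES.mkdir(exist_ok=True)
    path = INDEXES / f"index_{repo_id}.pickle"
    with path.open("wb") as f:
        pickle.dump(index, f)

    meta = json.loads(METADATA.read_text())
    meta[repo_id].update(status="finished", path=str(path))
    METADATA.write_text(json.dumps(meta))
```

One way to satisfy the single-Dockerfile constraint is an entrypoint script that starts the Redis broker, the Celery worker, and the uvicorn server together.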