Closed phact closed 5 months ago
Thanks @phact! Looks like a minor rebase is needed. At first glance I think this is a pretty good start, apart from a couple of notes:
I'm getting this for most files, so we may need to do something to make OpenAI understand these are text files:
Dockerfile: Error code: 501 - {'message': 'Unsupported file type'}
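One possible workaround, sketched under an assumption this thread does not confirm: that the service decides the file type from the file name's extension. If so, extensionless text files such as Dockerfile could be uploaded under a name ending in .txt. Both helper names below are hypothetical.

```python
def looks_like_text(data: bytes) -> bool:
    """Heuristic: treat data as text if it has no NUL bytes and is valid UTF-8."""
    if b"\x00" in data:
        return False
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def upload_name(name: str) -> str:
    """Hypothetical helper: the name to upload under, with a .txt suffix appended.

    Assumption: the backend infers the type from the extension.
    """
    return name if name.endswith(".txt") else name + ".txt"
```

With this, Dockerfile would be sent as Dockerfile.txt, while binary files would be skipped before upload.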
I was thinking about what file types we should add. Maybe the play is to use a block list like sweep does:
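A minimal sketch of what a block-list filter could look like. The directory and suffix entries are illustrative assumptions, not sweep's actual list, and is_blocked is a hypothetical name.

```python
from pathlib import PurePosixPath

# Hypothetical block list: illustrative entries only, not copied from sweep.
BLOCKED_DIRS = {".git", "node_modules", "__pycache__", "dist", "build"}
BLOCKED_SUFFIXES = {
    ".png", ".jpg", ".jpeg", ".gif", ".ico", ".pdf",
    ".zip", ".gz", ".lock", ".pyc", ".so", ".bin",
}


def is_blocked(path: str) -> bool:
    """Return True if a repo-relative path should be skipped."""
    p = PurePosixPath(path)
    # Skip anything that lives under a blocked directory.
    if any(part in BLOCKED_DIRS for part in p.parts[:-1]):
        return True
    # Skip files whose extension is on the block list (case-insensitive).
    return p.suffix.lower() in BLOCKED_SUFFIXES
```

A block list keeps extensionless files such as Dockerfile eligible by default, which an allow list of extensions would not.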
Merging. I'm going to structure this a bit and turn my notes into a new issue.
Added a new CLI command, process-repo-files, which clones the repo, switches to the right commit hash, and iterates (sequentially for now) over all the files, uploading them to assistants-api. This gets us embeddings and RAG for free. I clone to /tmp for now, not sure if that's what we want.
In the end it dumps a list of file_ids that you can give your assistant for searching.
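The flow described above could be sketched roughly as follows. This is not the PR's actual code: checkout_repo, the process_repo_files signature, and the upload callback are assumptions for illustration, and the callback stands in for whatever call sends a file to assistants-api and returns its file id.

```python
import subprocess
import tempfile
from pathlib import Path
from typing import Callable, List


def checkout_repo(repo_url: str, commit: str) -> Path:
    """Clone repo_url into a fresh temp directory and check out commit."""
    workdir = Path(tempfile.mkdtemp(prefix="repo-"))
    subprocess.run(["git", "clone", "--quiet", repo_url, str(workdir)], check=True)
    subprocess.run(["git", "-C", str(workdir), "checkout", "--quiet", commit], check=True)
    return workdir


def process_repo_files(root: Path, upload: Callable[[Path], str]) -> List[str]:
    """Upload every file under root sequentially and collect the returned ids."""
    file_ids: List[str] = []
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        # Skip directories and git metadata.
        if not path.is_file() or ".git" in rel.parts:
            continue
        try:
            file_ids.append(upload(path))
        except Exception as exc:  # e.g. an 'Unsupported file type' response
            print(f"{rel}: {exc}")
    return file_ids
```

Using tempfile.mkdtemp gives each run its own directory instead of a fixed path under /tmp, which may or may not be what is wanted here. Passing the uploader in as a callback keeps the traversal testable without network access.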
If this is interesting I'm happy to help optimize how assistants chunks and embeds for our purposes.