walking-octopus closed this 8 months ago
I like this as an ability, but I think it's something that should live in an additional tool rather than being baked into Symbex itself.
I actually use Symbex along with my LLM embedding tool for semantic search - building a search index that's populated using Symbex. I wrote more about how I do that in the release notes here: https://github.com/simonw/symbex/releases/tag/1.4
Models like CodeBERT are available, and if efficient CPU inference is a goal and you don't mind some quirks, bert.cpp may be an excellent GGML-based inference framework.
I think this can enable queries based on intent or approximate names. A symbol like `get_status` may be fuzzy-matched to `fetchSystemState`,
or maybe even a natural language query like "how is the CSV parsing implemented?" would yield good enough results (possibly enabling a Bash script built around a ReAct-style pattern, where the LLM uses a code-search tool to fetch context).
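To make the idea concrete, here is a rough sketch of ranking symbol names against a query by vector similarity. It is not how Symbex or any embedding tool works: `toy_embed` is a hypothetical stand-in that just counts character trigrams, so it only catches lexical overlap (`status` vs. `State`), and the symbol list is made up. A real setup would presumably swap in an actual embedding model such as CodeBERT through the `embed` parameter.

```python
import math
import re
from collections import Counter


def split_identifier(name: str) -> list[str]:
    """Split snake_case, camelCase, or plain text into lowercase words."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [p.lower() for p in parts]


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: character-trigram counts."""
    grams: Counter = Counter()
    for word in split_identifier(text):
        padded = f" {word} "
        for i in range(len(padded) - 2):
            grams[padded[i:i + 3]] += 1
    return grams


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def search(query: str, symbols: list[str], embed=toy_embed, top_k: int = 3):
    """Return the top_k (score, symbol) pairs most similar to the query."""
    q = embed(query)
    scored = [(cosine(q, embed(s)), s) for s in symbols]
    return sorted(scored, reverse=True)[:top_k]


# Made-up symbol names, purely for illustration.
SYMBOLS = ["fetchSystemState", "parse_csv_row", "render_template"]

for query in ("get_status", "how is the CSV parsing implemented?"):
    print(query, "->", search(query, SYMBOLS, top_k=1))
```

Because `embed` is just a parameter, the same ranking loop should work unchanged with model-produced vectors, as long as `cosine` is adapted to whatever vector type the model returns.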