support chatml-function-calling via llama-cpp

dnakov commented 9 months ago

[x] support chatml-function-calling via llama-cpp

trufae commented 9 months ago

i would prefer to merge smaller changes rather than having so many items in this PR. also you wont need to be merging and rebasing that frequently.

I am working on the markdown fix because the current interpreter.py have some legacy code that is broken and its not really working well with some models right now. I hope to get this done hopefully today

trufae commented 9 months ago

See the EPIC ticket :)

dnakov commented 9 months ago

sounds good, i'll split them up!

trufae commented 9 months ago

So it's ok to merge now?

dnakov commented 9 months ago

not yet, i haven't done any regression testing yet, ill let you know soon

trufae commented 9 months ago

I want to merge this https://github.com/radareorg/r2ai/pull/6 as it's actually fixing the behaviour of code colorization and ive tested it with several models. the code is still not fully cleaned but its much simpler and at least it works as expected.

is it ok for you for me to merge it?

dnakov commented 9 months ago

Yeah go ahead, I'll deal with any merge conflicts when this is ready

dnakov commented 9 months ago

ok this works now, merged with your changes. Not getting any good results with mistral, but that's more of a model problem. Hopefully functionary will be a lot better

trufae commented 9 months ago

Ready to merge to start experimenting with it? I think enabling vectordb is important for the auto mode, as well as extending the doc/data stuff to give more contextual hints about how to achieve things and which commands use

trufae commented 9 months ago

uh "This branch cannot be rebased due to conflicts"

trufae commented 9 months ago

conflicts are too large to be resolved via github, please fix them and force push that.

i would recommend you for the next PR to use a separate branch (not named master) and use git rebase instead of git merge

trufae commented 9 months ago

Also, note that commit messages must be capitalized, and i would recommend you to squash the commits

dnakov commented 9 months ago

ok should be good now

dnakov commented 9 months ago

Ready to merge to start experimenting with it? I think enabling vectordb is important for the auto mode, as well as extending the doc/data stuff to give more contextual hints about how to achieve things and which commands use

yeah, to the extent of if you -m TheBloke/Mistral-7B-Instruct-v0.2-GGUF then prompt it with ', it'll use that model and it will call some function, but chatml-function-calling and the models are not "smart" and reliable enough to know when to stop calling functions and send a message. This can probably be offset a bit with different prompting + RAG, but my hope is for functionary.

How do you envision structuring the RAG data? What's the best "use case -> commands" documentation/reference that we can shove into the vectordb? If we just put the r2 docs in there, i don't think we'd get the any good results for high level queries like "solve this crackme" or "what's the password"

trufae commented 9 months ago

We may probably update to the latest mistral models in the -M output (i filled a ticket for this). ive tested your code and it's not really working well. not sure if there's a way to debug what the model is doing internally to trace what's going on.. maybe via -e debug=true ?

About vectordb, what it does is to send the user prompt to the database and the database returns a list of sentences that can be prepended to the query for contextual information. this data can provide instructions to perform actions or information about the answer the user is looking for. so its transparent to the user and works across all models. i think integrating this in the auto mode will help a lot in the local results too.

dnakov commented 9 months ago

Yeah ive gotten it to work only like once by luck. I'll add some debugging.

About the vectors, yeah, I know how RAG works, I mean what text are you thinking of putting in there, do you have examples?

radareorg / r2ai

support chatml-function-calling via llama-cpp #4