TechNickAI / AICodeBot

AI-powered tool for developers, simplifying coding tasks and improving workflow efficiency. 🤖
GNU Affero General Public License v3.0

WIP: local LLM prototype (would like feedback) #70

Closed · hanselke closed this 3 months ago

hanselke commented 1 year ago

Added a docker-compose setup that launches NATS + Falcon-7B.

It's currently pretty hacky, since NATS requires async:

  1. Tried to use the async LLM `_call` functions and `agenerate`, but they give a weird error.
  2. Used `asyncclick` instead to get async CLI functions (see the sketch after this list).
  3. Ignored Ruff rule `INP001` ("`services/falcon7b/nats_falcon7b.py` is part of an implicit namespace package. Add an `__init__.py`.") with a TODO, since I don't know how to deal with it yet.
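
For concreteness, the asyncclick workaround looks roughly like this (a minimal sketch, assuming the `nats-py` client, a NATS server on the default port, and an illustrative subject name `falcon7b.generate`):

```python
import asyncclick as click
import nats


@click.command()
@click.argument("prompt")
async def generate(prompt):
    """Send a prompt to the falcon7b service over NATS request/reply."""
    nc = await nats.connect("nats://localhost:4222")
    try:
        # NATS request/reply is inherently async, which is what forced the
        # switch from plain click to asyncclick for the CLI entry points.
        reply = await nc.request("falcon7b.generate", prompt.encode(), timeout=60)
        click.echo(reply.data.decode())
    finally:
        await nc.close()


if __name__ == "__main__":
    generate()  # asyncclick runs the async command in an event loop
```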

Plus, Falcon-7B doesn't work with the current prompts; it returns pretty much the full prompt back for now.

TODOs pending approach review:

  1. Clean up the NATS server routing so it can be run outside of the Docker network.
  2. Build a universal-ish transformers pipeline for different models? Probably a few categories, e.g. bfloat16 and various quantization options (rough sketch below). Maybe find a way to rip off https://github.com/go-skynet/LocalAI.
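
One possible shape for that loader, as a sketch only: the mode names are made up, and `int8` assumes `bitsandbytes` is installed.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline


def load_text_generation(model_id: str, mode: str = "bfloat16"):
    """Load a causal LM in one of a few precision categories (hypothetical helper)."""
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    if mode == "bfloat16":
        model = AutoModelForCausalLM.from_pretrained(
            model_id,
            torch_dtype=torch.bfloat16,
            device_map="auto",
            trust_remote_code=True,  # Falcon shipped custom modeling code at the time
        )
    elif mode == "int8":  # quantized path, requires bitsandbytes
        model = AutoModelForCausalLM.from_pretrained(
            model_id,
            device_map="auto",
            load_in_8bit=True,
            trust_remote_code=True,
        )
    else:
        raise ValueError(f"unknown mode: {mode}")
    return pipeline("text-generation", model=model, tokenizer=tokenizer)
```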

Bottlenecks:

  1. Need a mechanism to use different prompts for different LLMs.
  2. Need to learn how to use pytest properly, and ideally have a code-generation test so we can actually compare the different LLMs. We need to close the loop so generated code runs against unit tests (sketch below).
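
A hedged sketch of both bottlenecks together: per-model prompts plus a parametrized pytest that runs generated code against assertions. `MODELS`, `PROMPTS`, and `generate_code()` are hypothetical placeholders, not anything in the repo.

```python
import pytest

MODELS = ["falcon7b", "openai"]  # hypothetical backend ids

PROMPTS = {
    # Different LLMs expect different prompt formats, so keep one per model.
    "falcon7b": "### Instruction:\n{task}\n### Response:\n",
    "openai": "{task}",
}


def generate_code(model: str, task: str) -> str:
    """Hypothetical: render the model-specific prompt and call the backend."""
    prompt = PROMPTS[model].format(task=task)
    raise NotImplementedError  # wire up to whichever backend is being compared


@pytest.mark.parametrize("model", MODELS)
def test_generated_fizzbuzz(model):
    source = generate_code(model, "Write a Python function fizzbuzz(n).")
    namespace: dict = {}
    exec(source, namespace)  # only safe for locally generated, inspected code
    assert namespace["fizzbuzz"](15) == "FizzBuzz"
```
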
TechNickAI commented 1 year ago

Working on this now

TechNickAI commented 1 year ago

I got basic Hugging Face Hub "working" (with bad results)

https://github.com/gorillamania/AICodeBot/commit/cb604ca970146d72d4dc836ba8a6888528a61c6c
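
For context, the Hub integration is roughly this shape: a sketch based on LangChain's `HuggingFaceHub` wrapper as it existed at the time, not the commit's exact code, with an illustrative repo id and parameters.

```python
import os

from langchain import HuggingFaceHub

# Calls the hosted inference API for a Hub model; repo id is an assumption.
llm = HuggingFaceHub(
    repo_id="tiiuae/falcon-7b-instruct",
    huggingfacehub_api_token=os.environ["HUGGINGFACEHUB_API_TOKEN"],
    model_kwargs={"temperature": 0.2, "max_new_tokens": 256},
)
print(llm("Write a commit message for: fix off-by-one in pagination"))
```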

But I think what we actually need is local LLMs, and the direction I'm going is this Docker image from Hugging Face that runs highly optimized local models:

https://github.com/huggingface/text-generation-inference#using-a-private-or-gated-model
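
Roughly, the idea would be: start the TGI container per its README (e.g. serving a model like `tiiuae/falcon-7b` on localhost, which is an assumption here), then hit its `/generate` endpoint. A sketch:

```python
import requests

# Query a locally running text-generation-inference container, assumed
# started via the TGI README's docker run command on port 8080.
resp = requests.post(
    "http://127.0.0.1:8080/generate",
    json={"inputs": "def fibonacci(n):", "parameters": {"max_new_tokens": 200}},
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["generated_text"])
```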

hanselke commented 1 year ago

Cool, I didn't know you could self-host models through Hugging Face Hub.

So from my research, it seems like we're going to need >20B parameters for it to be of any use.

I think the best way really is to close the loop by running the output code against unit tests. Then we'll be able to just run it through all the models out there. It probably won't be straightforward due to prompt differences, but I feel dumb testing them manually one by one, knowing that there are going to be more of them released over time.

TechNickAI commented 3 months ago

Thank you for your contribution. The code has long since diverged from this approach.