Code-Bullet / RickAndMortai

a chatgpt thingy with rick and morty characters and portals and there's shrek idk.

Local AI Text Generation Managers #1

Closed steven4547466 closed 1 year ago

steven4547466 commented 1 year ago

The main point of these commits is to add the BoogaAIController and the ProtocolAIController. Part of this change comes with a small refactor, like moving the system message to a text file so that you can give different controllers different system messages, or switch between system messages easily for testing.

The booga controller is for the oobabooga text-generation-webui; it interfaces with that webui's API.

The protocol controller is a generic controller that lets developers build their own text generation server; any server will work as long as it follows the protocol described below.

The other parts of this commit are mainly cleanup, like commenting out the ZeroMQ YouTube chat manager, as it is no longer used, and removing the related packages, which bloat the workspace and were giving me errors. Don't worry about the "Fix a couple infinite loops" commit and its revert; that was a mistake on my part.

Directions for oobabooga:

  1. First, drag and drop the Booga Manager into the Whole Thing manager's AIController field.
  2. Set the model name by selecting the Booga Manager under Managers and editing the Model field in the properties tab. Set it to the name of a model in your models/ folder, for example: airoboros-l2-13b-2.2.Q5_K_M.gguf
  3. Edit your model parameters as necessary:
     a. I recommend keeping Threads at 0, which is automatic.
     b. N Batch can be left at 512 for most systems, but may need to be lowered on low-end systems.
     c. N GPU Layers should be set to the number of layers you want to offload to your GPU. This massively decreases generation time, but you will need a compatible (and strong) GPU, and llama.cpp must be compiled with cuBLAS (https://github.com/ggerganov/llama.cpp#cublas).
     d. Increasing N Context allows the API to understand more, but as long as the system message stays moderately low in token count, there's not much point in increasing it. It may need to be decreased on low-end systems.
  4. Generation parameters:
     a. I don't even know what most of these do; you should mess around with them until you're getting outputs that you like.

Directions for protocol (MOST PEOPLE WILL NEVER USE THIS, BUT THE OPTION IS THERE; REQUIRES DEVELOPMENT EXPERIENCE): I'm not going to explain how to make the server, but I will explain the protocol your server must follow.

This project will send an HTTP POST request to the specified port (default 9998) with the following JSON structure:

[
  {
    "AuthorRole": 0,
    "Text": "SYSTEM MESSAGE"
  },
  {
    "AuthorRole": 1,
    "Text": "USER (input) MESSAGE"
  }
]

There might be more entries; just know that AuthorRole=0 is a system message and AuthorRole=1 is a user message.

It is your job to respond to this request with a simple string. No JSON, no XML, just a plain string containing the output, along with a 200 status code.
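For anyone implementing this, here is a minimal sketch of a conforming server in Python using only the standard library. It is not part of this PR, and it makes a few assumptions: the protocol above only specifies the port, so the handler accepts any request path, and the echo-style reply is a placeholder where a real server would call its own text generation model.

import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class ProtocolHandler(BaseHTTPRequestHandler):
    # Handle POSTs on any path, since the protocol only specifies the port.
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        messages = json.loads(self.rfile.read(length))

        # AuthorRole 0 is a system message, AuthorRole 1 is a user message.
        # A real server would feed both to its model; this placeholder
        # just echoes the last user message back.
        user_texts = [m["Text"] for m in messages if m["AuthorRole"] == 1]
        reply = f"You said: {user_texts[-1]}" if user_texts else "..."

        # Respond with a plain string and a 200 status code, no JSON or XML.
        body = reply.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

HTTPServer(("0.0.0.0", 9998), ProtocolHandler).serve_forever()

You can sanity-check a server like this by hand with a request shaped like the one shown above (the localhost URL and message text here are made up):

import json
import urllib.request

payload = json.dumps([
    {"AuthorRole": 0, "Text": "SYSTEM MESSAGE"},
    {"AuthorRole": 1, "Text": "Hello from the portal"},
]).encode("utf-8")

request = urllib.request.Request(
    "http://localhost:9998/",
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(request) as response:
    print(response.status, response.read().decode("utf-8"))

If the response body is a bare string with status 200, the ProtocolAIController should be able to consume it.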

With this, anyone can also make their own controller by extending the AIController class, which allows for more possibilities in the future.

lolerwaffles commented 1 year ago

@steven4547466 can you add a note about possibly needing to compile with cuBLAS support to the directions for GPU layers?

steven4547466 commented 1 year ago

> @steven4547466 can you add a note about possibly needing to compile with cuBLAS support to the directions for GPU layers?

Sure

Zyther commented 1 year ago

This is a giant PR, but it cleans up the base project a lot and implements local AI alongside OpenAI. This is great and, imo, should be considered "base".