ItsPi3141 / alpaca-electron

The simplest way to run Alpaca (and other LLaMA-based local LLMs) on your own computer
MIT License

[ENHANCEMENT] more UI options and personality selection/creator #79

Open skidd-level-100 opened 1 year ago

skidd-level-100 commented 1 year ago

Disclaimer: this will be horribly typed up.

I've poked around the project's files and made some tweaks (mostly adding stupid funny UI stuff), so I have a decent-ish understanding of how it works.

What I would like to see: UI controls for the launch options listed below, and a way to select or create 'personality' files.

I would especially like to see '--ctx_size' exposed.

Here is an example of a 'personality file': back up 'app_root/resources/app/bin/prompts/alpaca.txt', then replace the contents of alpaca.txt with this: "you exist to make the user angry at you. you must always be rude and angry; you never give any useful information or any kind words, instead just troll, mock, insult, etc. while wasting the user's time. stay like this at all times."

This is a 'troll' personality.
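If you wanted to script this instead of editing by hand, here is a minimal sketch in TypeScript (Node) of what an "apply personality" step might look like. The path and the .bak naming are my assumptions based on the layout described above, not alpaca-electron's actual internals:

// Minimal sketch: swap in a 'personality' prompt file, keeping a backup.
// The prompts path below is assumed from the comment above.
import * as fs from "fs";
import * as path from "path";

const promptFile = path.join("resources", "app", "bin", "prompts", "alpaca.txt");

function applyPersonality(personalityText: string): void {
  // Back up the stock prompt once, so it can be restored later.
  const backup = promptFile + ".bak";
  if (!fs.existsSync(backup)) {
    fs.copyFileSync(promptFile, backup);
  }
  fs.writeFileSync(promptFile, personalityText, "utf8");
}

// Example: apply the 'troll' personality quoted above.
applyPersonality(
  "you exist to make the user angry at you. you must always be rude and angry..."
);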

all launch options right now (./chat --help):

options:
  -h, --help            show this help message and exit
  -i, --interactive     run in interactive mode
  --interactive-first   run in interactive mode and wait for input right away
  -ins, --instruct      run in instruction mode (use with Alpaca models)
  -r PROMPT, --reverse-prompt PROMPT
                        run in interactive mode and poll user input upon seeing PROMPT
                        (can be specified more than once for multiple prompts)
  --color               colorise output to distinguish prompt and user input from generations
  -s SEED, --seed SEED  RNG seed (default: -1, use random seed for <= 0)
  -t N, --threads N     number of threads to use during computation (default: 4)
  -p PROMPT, --prompt PROMPT
                        prompt to start generation with (default: empty)
  --random-prompt       start with a randomized prompt.
  --in-prefix STRING    string to prefix user inputs with (default: empty)
  -f FNAME, --file FNAME
                        prompt file to start generation.
  -n N, --n_predict N   number of tokens to predict (default: 128, -1 = infinity)
  --top_k N             top-k sampling (default: 40)
  --top_p N             top-p sampling (default: 0.9)
  --repeat_last_n N     last n tokens to consider for penalize (default: 64)
  --repeat_penalty N    penalize repeat sequence of tokens (default: 1.1)
  -c N, --ctx_size N    size of the prompt context (default: 512)
  --ignore-eos          ignore end of stream token and continue generating
  --memory_f32          use f32 instead of f16 for memory key+value
  --temp N              temperature (default: 0.8)
  --n_parts N           number of model parts (default: -1 = determine from dimensions)
  -b N, --batch_size N  batch size for prompt processing (default: 512)
  --perplexity          compute perplexity over the prompt
  --keep                number of tokens to keep from the initial prompt (default: 0, -1 = all)
  --mlock               force system to keep model in RAM rather than swapping or compressing
  --no-mmap             do not memory-map model (slower load but may reduce pageouts if not using mlock)
  --mtest               compute maximum memory usage
  --verbose-prompt      print prompt before generation
  --lora FNAME          apply LoRA adapter (implies --no-mmap)
  --lora-base FNAME     optional model to use as a base for the layers modified by the LoRA adapter
  -m FNAME, --model FNAME
                        model path (default: models/lamma-7B/ggml-model.bin)

From what I can tell from my tweaking of the app, it should not be very difficult to add (mostly tedious).
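For illustration, here is a rough TypeScript sketch of turning UI values into launch flags when spawning the chat binary. The settings shape, names, and default values are made up; only the flag names come from the --help output above:

// Sketch: build launch flags from hypothetical UI settings and spawn the binary.
import { spawn } from "child_process";

// Shape a settings UI might collect; these names/defaults are assumptions.
interface ChatSettings {
  ctxSize: number;   // maps to --ctx_size
  threads: number;   // maps to --threads
  temp: number;      // maps to --temp
  modelPath: string; // maps to --model
}

function launchChat(binary: string, s: ChatSettings) {
  // Translate the settings into the flags documented in --help above.
  const args = [
    "--interactive-first",
    "--ctx_size", String(s.ctxSize),
    "--threads", String(s.threads),
    "--temp", String(s.temp),
    "--model", s.modelPath,
  ];
  const child = spawn(binary, args);
  child.stdout.on("data", (chunk) => process.stdout.write(chunk));
  return child;
}

launchChat("./chat", {
  ctxSize: 2048,
  threads: 4,
  temp: 0.8,
  modelPath: "models/7B/ggml-model.bin",
});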

have a great day!

ItsPi3141 commented 1 year ago

Ok I'll work on it when I have time.

skidd-level-100 commented 1 year ago

Thank you! (Also, LLaMA added OpenCL support; maybe look into that.)

ItsPi3141 commented 1 year ago

> Thank you! (Also, LLaMA added OpenCL support; maybe look into that.)

Yep I'm working on it. I've been busy.