ViperX7 / Alpaca-Turbo

Web UI to run alpaca model locally
GNU Affero General Public License v3.0
876 stars 92 forks

How to load leaked LLaMA weights? #55

Open jontstaz opened 1 year ago

jontstaz commented 1 year ago

Hi,

Your project looks very promising. I'm curious how I can leverage the leaked LLaMA weights with Alpaca-Turbo, specifically the 65B model. Does anyone have any idea what the correct process is?

I have the following files:

Thanks in advance,

mvsite commented 1 year ago

You are looking at the wrong files. Alpaca-Turbo is a web UI for the "main.exe" (aka "chat.exe") binary, which is the llama.cpp program for running models on CPU. Long story short: you want the 4-bit quantized model files, which you can find all over Hugging Face. Those files mostly aren't hosted here, partly because of their size, but also because everything derived from the LLaMA leak is legally problematic (Meta owns them, so redistributing the leak risks a DMCA takedown).

You will have a pretty decent success rate by looking for large single files ending in ".bin", and better luck where the filename contains "q4" or "4bit". For optimal results, you could go past these hints to actually knowing what you're doing (which, for the most part, I don't either).

Anyway, new tools dropped while I was typing this that are easier to use and come bundled with the latest models. Not trying to be a dick and cross-promote in this guy's GitHub, but just express interest in AI on YouTube or TikTok and you'll be bombarded with them everywhere, so look around.
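The filename heuristic above can be sketched in a few lines of Python. This is just an illustration of the "single large .bin with q4/4bit in the name" rule of thumb; the directory and filenames are hypothetical, not actual Hugging Face downloads:

```python
from pathlib import Path

def find_quantized_models(model_dir):
    """Return names of .bin files whose filenames hint at 4-bit quantization."""
    hints = ("q4", "4bit")  # common markers, e.g. ggml-model-q4_0.bin
    candidates = []
    for path in Path(model_dir).glob("*.bin"):
        name = path.name.lower()
        if any(hint in name for hint in hints):
            candidates.append(path.name)
    return sorted(candidates)
```

Running this over your models folder should surface the files worth trying with the chat binary first, before falling back to trial and error.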