-
LoRAs are distributed on Hugging Face as folders containing two files:
```
$ ls kaiokendev_SuperCOT-7b
adapter_config.json
adapter_model.bin
```
How can such a LoRA be loaded using the new pef…
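A minimal sketch of loading such an adapter folder, assuming `peft` and `transformers` are installed and that the base checkpoint is LLaMA-7B (the snippet does not say which base model the adapter was trained on — the IDs below are illustrative):

```python
def load_lora(base_model_id: str, adapter_dir: str):
    """Attach a LoRA adapter (adapter_config.json + adapter_model.bin)
    to a base causal LM. Model IDs passed in are assumptions, not
    taken from the snippet."""
    # Lazy imports so the sketch can be defined without the libraries loaded.
    from transformers import AutoModelForCausalLM
    from peft import PeftModel

    base = AutoModelForCausalLM.from_pretrained(base_model_id)
    # PeftModel reads adapter_config.json and loads adapter_model.bin.
    return PeftModel.from_pretrained(base, adapter_dir)

# Downloads several GB of weights, so commented out here:
# model = load_lora("huggyllama/llama-7b", "kaiokendev/SuperCOT-7b")
```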
-
- [ ] [Qwen-1.5-8x7B : r/LocalLLaMA](https://www.reddit.com/r/LocalLLaMA/comments/1atw4ud/qwen158x7b/)
# TITLE: Qwen-1.5-8x7B : r/LocalLLaMA
**DESCRIPTION:** "Qwen-1.5-8x7B
New Model
Someone creat…
-
Hi!
Thank you for the paper! It is inspiring that you can compress weights to about 1 bit and the model still works better than random.
A practical sub-2-bit quantization algorithm would be a grea…
-
### Check for existing issues
- [X] Completed
### Describe the bug / provide steps to reproduce it
When I configure OpenAI, I only see a field for the API key but no fields for the URL and port. In my …
-
Hi all !
The model is working great! I am trying to use my 8GB 4060 Ti with MODEL_ID = "TheBloke/vicuna-7B-v1.5-GPTQ"
MODEL_BASENAME = "model.safetensors"
I changed the GPU today, the previous one wa…
-
## Issue Description
Hello GitHub community,
I am currently seeking guidance on how to effectively evaluate the MADLAD 400 model, a 7.2B parameter machine translation (MT) model that has been fi…
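A common starting point for evaluating an MT model like MADLAD-400 is corpus BLEU via an established scorer such as `sacrebleu`. As a self-contained illustration of the metric itself (a toy sentence-level BLEU with uniform n-gram weights and a brevity penalty — not a substitute for a real scorer):

```python
import math
from collections import Counter

def toy_bleu(hyp: str, ref: str, max_n: int = 4) -> float:
    """Toy sentence-level BLEU: geometric mean of 1..max_n n-gram
    precisions, times a brevity penalty. For real evaluation use
    sacrebleu, which also handles tokenization and corpus statistics."""
    hyp_t, ref_t = hyp.split(), ref.split()
    precisions = []
    for n in range(1, max_n + 1):
        h = Counter(tuple(hyp_t[i:i + n]) for i in range(len(hyp_t) - n + 1))
        r = Counter(tuple(ref_t[i:i + n]) for i in range(len(ref_t) - n + 1))
        overlap = sum((h & r).values())       # clipped n-gram matches
        precisions.append(overlap / max(sum(h.values()), 1))
    if min(precisions) == 0:
        return 0.0
    # Brevity penalty: punish hypotheses shorter than the reference.
    bp = 1.0 if len(hyp_t) > len(ref_t) else math.exp(
        1 - len(ref_t) / max(len(hyp_t), 1))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
```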
-
DRY is a modern repetition penalty that is superior to the standard frequency and presence penalties at preventing repetition, while having virtually none of their negative effects on language quality…
p-e-w updated 3 weeks ago
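A toy sketch of the penalty schedule DRY applies: once a token would extend a repeated sequence longer than an allowed length, the penalty grows exponentially with the match length. Parameter names follow p-e-w's proposal (`multiplier`, `base`, `allowed_length`); this illustrates the idea only and is not the actual implementation:

```python
def dry_penalty(match_len: int, multiplier: float = 0.8,
                base: float = 1.75, allowed_length: int = 2) -> float:
    """Logit penalty for a token that would extend a repetition of
    `match_len` context tokens. Short matches are free; beyond
    `allowed_length` the penalty grows as base**(match_len - allowed_length)."""
    if match_len < allowed_length:
        return 0.0
    return multiplier * base ** (match_len - allowed_length)
```

Because the penalty targets tokens that would *continue* a verbatim repeat, it leaves ordinary word reuse alone — which is why it avoids the quality degradation of blanket frequency/presence penalties.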
-
Have you considered using LangChain plus some truly open-source model?
There's also a JS port of LangChain + Alpaca: https://github.com/linonetwo/langchain-alpaca
Can't recommend any use-case specif…
-
Hi,
Thanks to the great work of the authors of AWQ, maintainers at [TGI](https://github.com/huggingface/text-generation-inference), and the open-source community, AWQ is now supported in TGI ([link…
-
This seems like it'll be the most important task to make this more viable for people.
Alternative models will be cheaper, potentially much faster, allow running on someone's own hardware (LLaMa), a…