-
Sorry for a newbie question; I couldn't find an answer. I succeeded in launching the server with unquantised Mistral 7B:
```
python3 -m sglang.launch_server --model-path mistralai/Mistral-7B-Instruct-v0.2…
```
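Once it's up, the server can be queried over HTTP. A minimal sketch, assuming the default port 30000 and SGLang's native `/generate` endpoint (both are assumptions; adjust if your launch flags differ):
```
import requests

# Query the running SGLang server; port 30000 is the launcher's default.
response = requests.post(
    "http://localhost:30000/generate",
    json={
        "text": "[INST] What is the capital of France? [/INST]",
        "sampling_params": {"max_new_tokens": 64, "temperature": 0.7},
    },
)
print(response.json()["text"])
```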
-
Thanks for your awesome work.
[swift](https://github.com/modelscope/swift) now supports inference and training of the InternVL-Chat-V1.5 model.
For more information, please refer to our documentation:
- [Eng…
-
### System Info
peft==0.4.0.dev0
I'm not sure whether this should be a bug report, so apologies if this isn't the right place.
According to the `save_pretrained` method docstring, this saves the adapter model…
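For reference, a minimal sketch of the adapter-only save the docstring describes (the base model and output path are illustrative):
```
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")
model = get_peft_model(base, LoraConfig(task_type="CAUSAL_LM"))

# Writes only adapter_config.json and the adapter weights
# (adapter_model.bin in peft 0.4.x), not the base model.
model.save_pretrained("my-adapter")
```
The adapter can then be reloaded on top of the base model with `PeftModel.from_pretrained(base, "my-adapter")`.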
-
### System Info
CUDA: 12.1
OS: Windows x64
pip: 24.0
Python: 3.10.10
transformers: 4.40.0
bitsandbytes: 0.43.1
### Who can help?
Hey there @younesbelkada, @amyeroberts, I am getting…
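Since the traceback above is cut off, here is a minimal sketch of the kind of 4-bit load that exercises bitsandbytes on this stack (the model name is illustrative):
```
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# NF4 4-bit quantized load; this is the path that typically fails when
# the bitsandbytes CUDA binaries are not picked up on Windows.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.2",
    quantization_config=bnb_config,
    device_map="auto",
)
```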
-
### Checklist
- [X] 1. I have searched related issues but cannot get the expected help.
- [X] 2. The bug has not been fixed in the latest version.
### Describe the bug
Started the service following the huggingface README:…
-
### System Info
Hi,
I am fine-tuning the Mistral 7B model. I am getting long runs of automatically generated text from the fine-tuned model. I have kept eos_token=True. Can someone please tell me how to add a…
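The usual fix for runaway generation is to make sure every training example actually ends with the EOS token. A hedged sketch (the formatting function and dataset field are illustrative):
```
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")
# Use a distinct pad token so EOS is not masked out of the loss as padding.
tokenizer.pad_token = tokenizer.unk_token

def format_example(example):
    # Append EOS explicitly so the model learns where a response stops.
    return {"text": example["text"] + tokenizer.eos_token}
```
At inference time, also pass `eos_token_id=tokenizer.eos_token_id` to `generate` so decoding halts at the first EOS.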
-
Using the latest commit d3184ec, I was able to make my own 4bpw quant of dbrx-instruct. I am running into problems trying to load the model in text-generation-webui (using that same commit of exllamav…
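One way to narrow this down is to load the quant directly with the exllamav2 Python API, bypassing text-generation-webui; a sketch assuming the quant directory is local (the path is illustrative):
```
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "/models/dbrx-instruct-4bpw"  # illustrative path
config.prepare()

model = ExLlamaV2(config)
# Lazy cache + autosplit spreads the weights across available GPUs.
cache = ExLlamaV2Cache(model, lazy=True)
model.load_autosplit(cache)

tokenizer = ExLlamaV2Tokenizer(config)
generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)
print(generator.generate_simple("Hello,", ExLlamaV2Sampler.Settings(), 32))
```
If this loads and generates fine, the problem is likely in the webui loader rather than the quant itself.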
-
### Checklist
- [ ] 1. I have searched related issues but cannot get the expected help.
- [ ] 2. The bug has not been fixed in the latest version.
### Describe the bug
![e341993d05015208d204…
-
Hello,
I would like to use pipeline serialization. Below is my code for the llm:
```
llm:
  init_parameters:
    huggingface_pipeline_kwargs:
      model: mistralai/Mixtral-8x7B-Instru…
```
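Assuming this is Haystack 2.x (the `init_parameters` layout suggests it), the YAML can be round-tripped with `Pipeline.loads` and `dumps`; a minimal sketch with an illustrative file name:
```
from haystack import Pipeline

# Deserialize a pipeline from its YAML representation...
with open("pipeline.yaml") as f:
    pipeline = Pipeline.loads(f.read())

# ...and serialize it back to a YAML string.
yaml_str = pipeline.dumps()
```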
-
I tried to apply 4-bit LoRA training to EsmModel. However, there is an error specific to 4-bit training; it disappears completely once `load_in_4bit=True` is commented out.
Cod…
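For comparison, a hedged sketch of a 4-bit LoRA setup for an ESM checkpoint; the checkpoint name and `target_modules` are illustrative and should match your model's attention projection names:
```
import torch
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import AutoModel, BitsAndBytesConfig

model = AutoModel.from_pretrained(
    "facebook/esm2_t12_35M_UR50D",  # illustrative ESM checkpoint
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.float16,
    ),
    device_map="auto",
)
# Prepare the quantized model for training (fp32 norms, input grads enabled).
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["query", "value"],  # ESM attention projection names
    task_type="FEATURE_EXTRACTION",
)
model = get_peft_model(model, lora_config)
```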