-
Has anyone been able to get the LLaMA-2 70B model to run inference with 4-bit quantization using HuggingFace? Here are some variations of the code I've tried, based on various guides:
```python3
nam…
```
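For anyone comparing notes, here is a minimal sketch of the 4-bit loading pattern that is generally expected to work, assuming recent versions of transformers, bitsandbytes, and accelerate; the model id and prompt are placeholders:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Placeholder model id; substitute the checkpoint you are actually using.
model_id = "meta-llama/Llama-2-70b-hf"

# NF4 4-bit weights with bf16 compute: the usual QLoRA-style configuration.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # shard layers across all visible GPUs
)

inputs = tokenizer("Hello, my name is", return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```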
-
**Hardware**:
CPU: Xeon® E5-2630 v2, but limited to 16 GB of RAM, as that is what the vast.ai instance provides.
GPU: 4x A40 --> 192 GB of VRAM in total
**OS**
Linux
**Python**
3.10
**CUDA**
12.2
**packa…
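Given that layout (four 48 GB A40s but only 16 GB of system RAM), one thing worth checking is whether loading is silently offloading to CPU. A sketch of capping per-device memory follows; the exact limits and the checkpoint name are assumptions to tune:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Hypothetical per-device caps: leave headroom on each 48 GB A40 and keep the
# CPU budget small, since the instance only has 16 GB of system RAM.
max_memory = {i: "44GiB" for i in range(4)}
max_memory["cpu"] = "8GiB"

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-70b-hf",  # assumed checkpoint, per the question above
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    ),
    device_map="auto",
    max_memory=max_memory,
)
```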
-
### System Info
Hi guys, I just fine-tuned Alpaca (LLaMA 7B base model) on a custom dataset using the Trainer API. After the training process completed, I received the following error:
```python
…
```
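Since the traceback above is cut off, here is a minimal sketch of the setup being described, to make the failure point easier to place; the base checkpoint and the toy dataset are placeholders:

```python
from datasets import Dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

# Placeholder base checkpoint; the poster fine-tunes a LLaMA 7B base model.
base = "huggyllama/llama-7b"
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token  # LLaMA tokenizers ship without a pad token
model = AutoModelForCausalLM.from_pretrained(base)

# Stand-in for the custom instruction dataset, already tokenized.
texts = ["### Instruction: say hi\n### Response: hi"]
train_dataset = Dataset.from_dict(dict(tokenizer(texts)))

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="alpaca-out", num_train_epochs=1),
    train_dataset=train_dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
trainer.save_model("alpaca-out")  # the reported error appears after training completes
```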
-
Hey,
I'm trying to use a quantized model due to memory issues.
We usually load the model like this:
```
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dty…
```
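For reference, a complete version of that pattern with the usual NF4 settings; the model id is a placeholder, and the specific dtype/quant-type choices are common defaults rather than requirements:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16 while weights stay 4-bit
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,  # also quantize the quantization constants
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # placeholder model id
    quantization_config=quantization_config,
    device_map="auto",
)
```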
-
OS: Debian 9 amd64
```
/build/servo/target/release/deps/libgstreamer_player-47ed31aa97c38542.rlib(gstreamer_player-47ed31aa97c38542.gstreamer_player.1u6u9zpa-cgu.8.rcgu.o):gstreamer_player.1u6u9zp…
```
-
### When I run the following script
```
import torch
from accelerate import Accelerator, PartialState
from peft import LoraConfig
from tqdm import tqdm
from transformers import AutoTokenizer, …
```
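For context, a rough sketch of how those imports usually fit together; the checkpoint name, LoRA hyperparameters, and target modules are assumptions for a LLaMA-style model:

```python
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder checkpoint; substitute the model the script actually loads.
model_name = "meta-llama/Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

# Typical LoRA hyperparameters for a LLaMA-style model (assumed values).
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```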
-
When I run the int4 version locally, I get the following error:
```
(MiniCPMV) yushen@user-MS-7E06:~/ai/MiniCPM-V$ python web_demo_2.5_gy.py --device cuda
Unused kwargs: ['_load_in_4bit', '_load_in_8bit', 'quant_method']. These kwargs are not use…
```
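For comparison, the published int4 checkpoint is normally loaded along these lines (a sketch following the MiniCPM-V README; note that the `Unused kwargs` line is a warning, not necessarily the actual failure):

```python
from transformers import AutoModel, AutoTokenizer

# The int4 checkpoint requires bitsandbytes and a CUDA GPU; do not call
# .to("cuda") on an already-quantized 4-bit model.
model_id = "openbmb/MiniCPM-Llama3-V-2_5-int4"
model = AutoModel.from_pretrained(model_id, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model.eval()
```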
-
python /services/srv/MiniCPM-V/web_demo_2.5.py --device cuda [web_demo_2.5.py has already been modified to use the MiniCPM-Llama3-V-2_5-int4 model]
Unused kwargs: ['_load_in_4bit', '_load_in_8bit', 'quant_method']. These kwargs ar…
-
### System Info
I am running a script inside a Docker container in a Linux environment.
### Who can help?
@younesbelkada this issue is similar to, but not the same as, #24137.
### Information
- [ ] The o…
-
**Describe the bug**
I just did a full fine-tune of the `florence-2-large-ft` model, and now I can't run it.
# Command to reproduce
```txt
CUDA_VISIBLE_DEVICES=0 swift infer --ckpt_dir output/flore…
```