-
Shrinking the context window and the positional-embedding size both seem to have no effect. What is causing the increased GPU memory usage? Compared with the first-generation Qwen, memory usage has grown substantially.
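One way to reason about this is to estimate the KV-cache size separately from the weights: weight memory is unaffected by the context window, and the positional-embedding table is tiny either way, so if memory barely moves when the context shrinks, the cache was probably not the dominant term. A rough estimator (an illustrative sketch; the exact layout depends on the model implementation, batch size, and attention variant):

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len,
                   batch_size=1, dtype_bytes=2):
    """Rough upper bound on KV-cache size in bytes.

    Two tensors (K and V) per layer, each shaped
    [batch, num_kv_heads, seq_len, head_dim].
    """
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * batch_size * dtype_bytes

# Hypothetical 32-layer model, 32 KV heads, head_dim 128, 8k context, fp16:
print(kv_cache_bytes(32, 32, 128, 8192) / 2**30, "GiB")  # → 4.0 GiB
```

Models that use grouped-query attention have far fewer KV heads than query heads, which shrinks this figure proportionally; the head counts above are placeholders, not Qwen's actual configuration.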
-
### bug描述 Describe the Bug
platform: windows11
paddlepaddle-gpu: 2.4.2
paddledet: 2.6.0
paddleslim: installed via setup.py
example/auto_compression/detection/run.py fails with the following error:
ModuleNotFoundError: No module named …
-
Couldn't find any similar other issues in `accelerate`, `peft`, or `trl` so I'm opening one here. When using the DPOTrainer on a single GPU with QLoRA I have no issues, but when I try to run the scrip…
-
I had a flash of inspiration while looking at this specific [iGrow](https://www.greenhousemegastore.com/equip/controls-measuring-tools/environmental-controls/igrow-1400-greenhouse-controller) model.
The circuit is…
-
For example, I have two RTX 3090 GPUs, and both the model and the ref_model are 14-billion-parameter models. I need to distribute these two models evenly across the two cards for training.
This is my code…
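One way to spread a model's layers evenly over two cards is to build an explicit device map before loading. A minimal sketch that constructs such a map by hand; the module names (`model.layers`, `model.embed_tokens`, `lm_head`) are assumptions matching common Llama/Qwen-style architectures, and in practice `device_map="auto"` or accelerate's `infer_auto_device_map` does this for you:

```python
def even_device_map(num_layers, prefix="model.layers", devices=(0, 1)):
    """Place the first half of the transformer blocks on devices[0]
    and the second half (plus norm and lm_head) on devices[1]."""
    half = (num_layers + 1) // 2
    device_map = {
        "model.embed_tokens": devices[0],  # input embeddings with the first half
        "model.norm": devices[1],          # final norm with the second half
        "lm_head": devices[1],             # output head next to the last block
    }
    for i in range(num_layers):
        device_map[f"{prefix}.{i}"] = devices[0] if i < half else devices[1]
    return device_map

dm = even_device_map(4)  # blocks 0-1 on GPU 0, blocks 2-3 on GPU 1
```

The resulting dict can be passed as `device_map=` to `from_pretrained`. Note that for DPO it is often simpler to shard the policy model this way and keep the frozen ref_model on one card, since it runs forward-only.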
-
```
Traceback (most recent call last):
File "Sakura_DPO.py", line 318, in <module>
fire.Fire(train)
File "/root/miniconda3/lib/python3.8/site-packages/fire/core.py", line 141, in Fire
com…
-
I got the following when running Dia-NN on a set of six Thermo raw files:
```
DIA-NN 1.7.12 (Data Independent Acquisition by Neural Networks)
Compiled on Oct 1 2020 21:36:38
Current date and ti…
-
Running inference on the DeepSeek model with vLLM fails with an error:
```
[rank0]: self.mlp = DeepseekV2MoE(config=config, quant_config=quant_config)
[rank0]: File "/home/root/.local/lib/python3.10/s…
-
### System Info
`transformers` version: 4.32.0
- Platform: Linux-5.19.0-38-generic-x86_64-with-glibc2.35
- Python version: 3.10.9
- Huggingface_hub version: 0.16.4
- Safetensors version: 0.3.1
…
-
Hello!
I am testing mistral-7b inference after quantization. I also want to measure the impact of the attention implementation (sdpa, eager, fa2) on inference speed. But the model's decode latency is too high, and…
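When comparing attention backends it helps to isolate per-token decode latency with a small timing harness, so the numbers are comparable across runs. A sketch, independent of any particular model; `step_fn` is a hypothetical callable that performs one decode step, and on GPU the measured region should additionally be bracketed with `torch.cuda.synchronize()`:

```python
import time

def mean_decode_latency(step_fn, num_tokens=32, warmup=4):
    """Return the mean wall-clock time (seconds) of one decode step."""
    for _ in range(warmup):          # warm up caches / kernel compilation
        step_fn()
    start = time.perf_counter()
    for _ in range(num_tokens):      # timed decode steps
        step_fn()
    return (time.perf_counter() - start) / num_tokens

# Usage with a stand-in decode step:
latency = mean_decode_latency(lambda: sum(range(1000)))
```

Running the same harness once per `attn_implementation` setting gives a like-for-like comparison that separates backend differences from warmup effects.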