-
For milestone in https://github.com/Samsung/ONE/projects/9#card-79474017
## Candidate 1
[one-cmds PyTorch (or ONNX) LSTM op import fails · Issue #8217](https://github.com/Samsung/ONE/issues/82…
-
I'm testing quantized training and inference with torchrec, and I found that the quantized model sometimes produces wrong output at certain world sizes. I compared the outputs of a distributed model an…
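One plausible (hypothetical) source of world-size-dependent divergence is that each shard derives its own quantization scale, so the numerics change with the sharding. A toy plain-Python sketch of that effect (made-up names and scheme, not torchrec's actual code):

```python
def quantize(values, scale):
    """Symmetric int8 quantization with a given scale (toy scheme)."""
    return [max(-128, min(127, round(v / scale))) for v in values]

def dequantize(qvalues, scale):
    return [q * scale for q in qvalues]

def scale_for(values):
    """Per-tensor scale derived from the max absolute value."""
    return max(abs(v) for v in values) / 127

def roundtrip_global(table):
    """Quantize the whole table with one global scale."""
    s = scale_for(table)
    return dequantize(quantize(table, s), s)

def roundtrip_sharded(table, world_size):
    """Each rank quantizes its own shard with its own scale."""
    n = len(table) // world_size
    out = []
    for rank in range(world_size):
        shard = table[rank * n:(rank + 1) * n]
        s = scale_for(shard)
        out.extend(dequantize(quantize(shard, s), s))
    return out
```

With a table mixing small and large values, the global scale flushes the small entries to zero while per-shard scales preserve them, so results depend on the world size.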
-
I tried to run llava-v1.6-34b-hf-awq and succeeded, but how can I run the test for Llava-v1.5 ConditionalGeneration?
https://github.com/casper-hansen/AutoAWQ/pull/250
The bug in the example is likely:
1. ma…
-
**Describe the bug**
Adding `"zero_quantized_weights": true,` leads to a crash:
```
[35:1]: warnings.warn(
[35:1]:Traceback (most recent call last):
[35:1]: File "/data/env/lib/repos/retro-l…
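```

For reference, the ZeRO++ style config block where this flag lives (a hedged sketch; the stage and the gradient flag are assumptions, not taken from the report):

```json
{
  "zero_optimization": {
    "stage": 3,
    "zero_quantized_weights": true,
    "zero_quantized_gradients": true
  }
}
```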
-
Hi!
When trying to quantize the new DeepSeek Coder V2 https://huggingface.co/deepseek-ai/DeepSeek-Coder-V2-Instruct I got the following error:
```
!! Warning, unknown architecture: DeepseekV2F…
```
-
Do you only use the chat formatting from FastChat, or also inference? FastChat already supports GPTQ:
https://github.com/lm-sys/FastChat/blob/main/docs/gptq.md
My other idea was to edit loading parame…
-
Following [airockchip](https://github.com/airockchip)/[yolov5](https://github.com/airockchip/yolov5), I ran
```shell
python export.py --rknpu rk3399pro
```
which successfully exported yolov5s.torchscript; then in rknn-toolkit-1…
-
It's useful to be able to run a quantized transformer model exported to TorchScript on CUDA, even if some quantized operators are executed via dequantize → float32 op → requantize (some sort of temporary…
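The dequant/requant fallback described here can be sketched in plain Python with a toy int8 affine scheme (illustrative only; PyTorch's actual quantized kernels and observers are more involved):

```python
def affine_quantize(x, scale, zero_point):
    """Map a float to int8 under a toy affine scheme."""
    q = round(x / scale) + zero_point
    return max(-128, min(127, q))

def affine_dequantize(q, scale, zero_point):
    return (q - zero_point) * scale

def quantized_relu_fallback(q_in, scale, zero_point):
    """Fallback path: dequantize to float, run the float op, requantize.

    Stands in for running a float32 kernel on a backend (e.g. CUDA)
    that lacks the native quantized implementation of the op.
    """
    x = affine_dequantize(q_in, scale, zero_point)
    y = max(x, 0.0)  # the float32 op (ReLU here)
    return affine_quantize(y, scale, zero_point)
```

The result matches what a native quantized ReLU would produce under the same scale and zero point, at the cost of the extra conversions.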
-
We have an existing model [neuralmagic/SparseLLama-2-7b-ultrachat_200k-pruned_50.2of4](https://huggingface.co/neuralmagic/SparseLLama-2-7b-ultrachat_200k-pruned_50.2of4) that has already been pruned t…
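For context, the "2of4" in the model name refers to 2:4 semi-structured sparsity: at most two nonzero values in every group of four consecutive weights. A plain-Python checker (an illustrative sketch, not the pruning code itself):

```python
def is_2_of_4_sparse(weights):
    """Check 2:4 semi-structured sparsity: every group of four
    consecutive weights has at most two nonzero entries."""
    if len(weights) % 4 != 0:
        return False
    for i in range(0, len(weights), 4):
        group = weights[i:i + 4]
        if sum(1 for w in group if w != 0) > 2:
            return False
    return True
```

This pattern is what lets NVIDIA sparse tensor cores skip the zeroed weights at a fixed 50% ratio.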
-
**Describe the bug**
Loading the llama2 70b model in 4-bit (bitsandbytes) and then distributing it by calling deepspeed.initialize gives the following error:
```
------------------------…
```