-
**Describe the bug**
Following the instructions in [`examples/mistral`](https://github.com/microsoft/Olive/tree/main/examples/mistral) does not result in a quantized ONNX model. After running the wor…
-
1. When will you provide a pip package?
2. Will there be automatic backend selection for each layer? As far as I know, some backends have specific requirements, for example on bit width and channel count (see the sketch after this list).
3. Will you support layer …
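On point 2, here is a minimal sketch of what per-layer backend selection could look like under the stated constraints. `pick_backend` and the backend names are hypothetical illustrations, not the project's real API:

```python
# Hypothetical per-layer kernel-backend dispatch based on bit width and
# channel alignment (names and thresholds are illustrative assumptions).
def pick_backend(bits: int, in_features: int, out_features: int) -> str:
    if bits == 4 and in_features % 64 == 0 and out_features % 64 == 0:
        return "marlin"   # e.g. a fast 4-bit kernel requiring 64-aligned channels
    if bits in (2, 3, 4, 8):
        return "triton"   # more permissive fallback kernel
    return "torch"        # pure-PyTorch reference path

print(pick_backend(4, 4096, 4096))    # -> "marlin"
print(pick_backend(3, 4096, 11008))   # -> "triton"
```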
-
* Four months ago (November 2017), Visual Basic MVP Klaus Löffelmann (@KlausLoeffelmann) was invited to attend the VB LDM to present on the challenges of modern GUI programming. Klaus presented a few…
-
Status: Draft
Updated: 09/18/2024
# Objective
In this doc we’ll talk about how different optimization techniques are structured in torchao and how to contribute to torchao.
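As a concrete entry point before diving into the stack, here is a minimal sketch of torchao's top-level quantization API (assuming a recent torchao, roughly 0.4+, where these names are exported; exact import paths may differ across versions):

```python
import torch
from torchao.quantization import quantize_, int8_weight_only

# Swap the Linear layers' weights to int8, in place (a minimal sketch;
# assumes quantize_ and int8_weight_only are available in your torchao build).
model = torch.nn.Sequential(torch.nn.Linear(1024, 1024))
quantize_(model, int8_weight_only())
```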
# torchao Stack Ove…
-
Currently testing PR #87 and running into very slow quants for a TinyLlama 1.1B test model.
I am getting ~96 s per layer during quantization on a 4090 GPU with n_blocks = 1 and ~75 s per layer with n_blocks…
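For a sense of scale, a rough back-of-the-envelope, assuming TinyLlama 1.1B's 22 decoder layers (an assumption; check the model config for the actual count):

```python
# Estimated wall time for one full quantization pass at the observed rate.
seconds_per_layer = 96   # observed with n_blocks = 1
num_layers = 22          # assumed for TinyLlama 1.1B
print(f"~{seconds_per_layer * num_layers / 60:.0f} min total")  # ~35 min
```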
-
While testing OPT with `quant_lm_head=True`, here are the resulting weight keys after quantization:
`weight keys: ['lm_head.g_idx', 'lm_head.qweight', 'lm_head.qzeros', 'lm_head.scales', 'model.decoder.em…
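For reference, a small sketch of how such keys can be listed from a saved checkpoint (the filename is hypothetical; a safetensors checkpoint is assumed):

```python
from safetensors.torch import load_file

# List the GPTQ-style quantized-parameter keys in a saved state dict.
state = load_file("model.safetensors")  # hypothetical path
quant_suffixes = (".qweight", ".qzeros", ".scales", ".g_idx")
print(sorted(k for k in state if k.endswith(quant_suffixes)))
```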
-
There are different sym (symmetric quantization) implementations: one is the gptq/autoround way, the other is the awq/our-rtn way.
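A minimal sketch of the two conventions, assuming the usual distinction (the gptq/autoround style clamps codes to a symmetric range, while the awq/rtn style uses the full two's-complement range); actual implementations vary in details such as per-group scales:

```python
import torch

def sym_quant_restricted(w: torch.Tensor, bits: int = 4):
    # gptq/autoround-style sym (assumed): symmetric code range
    # [-(2^(b-1)-1), 2^(b-1)-1], so one negative code goes unused.
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().max().clamp(min=1e-8) / qmax
    return torch.clamp(torch.round(w / scale), -qmax, qmax), scale

def sym_quant_fullrange(w: torch.Tensor, bits: int = 4):
    # awq/rtn-style sym (assumed): full two's-complement range
    # [-2^(b-1), 2^(b-1)-1], trading exact symmetry for one extra code.
    qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    scale = w.abs().max().clamp(min=1e-8) / (2 ** (bits - 1))
    return torch.clamp(torch.round(w / scale), qmin, qmax), scale
```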
-
The README mentions FP4. Does it support FP4 inference? It looks like there is no implementation of FP4 inference in the current code.
If it is supported, is there a tutorial for using it (including q…
-
How much GPU memory and CPU RAM are required to quantize the ChatGLM3-6B model? I used an A100-40G but got a "killed" error.
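A "killed" message usually means the Linux OOM killer ran out of CPU RAM rather than GPU memory. A rough weight-size estimate, assuming ChatGLM3-6B's roughly 6.2B parameters:

```python
# Back-of-the-envelope weight memory for ChatGLM3-6B (~6.2B params, an assumption;
# check the model card for the exact count).
params = 6.2e9
print(f"fp16 weights: ~{params * 2 / 1e9:.1f} GB")  # ~12.4 GB
print(f"fp32 weights: ~{params * 4 / 1e9:.1f} GB")  # ~24.8 GB
# Loading in fp32 plus calibration activations can exceed CPU RAM and trigger the OOM killer.
```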