int8-quantization Search Results

1000+ results for int8-quantization


cpldcpu/BitNetMCU #4

4.6 quantization scheme

If you have plans to develop this project further, I would like to suggest a 4.6-bit scheme. https://www.mdpi.com/2227-7390/12/5/651 I think this is an interesting schematic that fits very well on a…

kimstik updated 3 months ago
7
iree-org/iree #14337

[LLVMCPU] Data tiling of group quantized matmul on CPU

The following is an example of a group quantized matmul found in Vicuna (pulled from https://github.com/nod-ai/SHARK/issues/1630, closely related to the i4 IR attached in #12859). ``` #map = affine_…

qedawkins updated 1 year ago
36
vllm-project/vllm #5001

[Bug]: 0.4.2 error on H20

### Your current environment ```text The output of `python collect_env.py` ``` root@9b33a89c3857:/workspace/vllm-0.4.2# python collect_env.py Collecting environment information... PyTorch versi…

tohneecao updated 2 months ago
13
PKU-YuanGroup/ChatLaw #45

How much GPU memory is needed to run the 13B model?

How much GPU memory is needed to run the 13B model?

Jonsun-N updated 8 months ago
12
TGSAI/mdio-python #259

Loading speed

Hey guys Nice work with SEG-Y loader! At our team, we use our own library to interact with SEG-Y data, so I've decided to give a try to MDIO and compare the results of multiple approaches and libra…

SergeyTsimfer updated 1 year ago
6
pytorch/ao #1076

torchao already works on raspberry pi

## Problem We don't publish aarch64 linux binaries so right now we still install ao=0.1 ``` (myvenv) marksaroufim@rpi5:~/Dev/ao $ pip install torchao Looking in indexes: https://pypi.org/simpl…

msaroufim updated 1 month ago
2
ultralytics/ultralytics #13314

How to Convert YOLOv10 Model to TFLite with INT8 Quantizatio…

### Search before asking - [X] I have searched the YOLOv8 [issues](https://github.com/ultralytics/ultralytics/issues) and [discussions](https://github.com/ultralytics/ultralytics/discussions) and fou…

AhmedFkih updated 4 weeks ago
5
vllm-project/vllm #10109

[Usage]: [XPU] offline_inference.py - RuntimeError: oneCCL: …

### Your current environment ```text Collecting environment information... [WARNING] Failed to create Level Zero tracer: 2013265921 WARNING 11-07 06:41:13 _logger.py:68] Failed to import from vllm…

rskasturi updated 6 days ago
2
OpenMOSS/MOSS #107

Error when loading the quantized model

Explicitly passing a `revision` is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision. Explicitly passing a `revision` is encouraged…

hhllxx1121 updated 1 year ago
2
abetlen/llama-cpp-python #1645

All requests end with 'finish_reason': 'length' when the max…

All requests end with 'finish_reason': 'length' when the max_tokens=-1 parameter is set. What could be the problem? **Model**: https://huggingface.co/IlyaGusev/saiga_mistral_7b_gguf/resolve/main/…

tur0kmagalp updated 2 months ago
1
