-
We aim to implement a system that leverages distillation and quantization to create a "child" neural network by combining parameters from two "parent" neural networks. The child network should inherit…
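A minimal sketch of what such a system might look like, assuming same-architecture parents merged by per-tensor interpolation and Hinton-style distillation against the two-parent ensemble. The merge rule, `alpha`, and temperature `T` are all illustrative assumptions, not the issue's specification:

```python
import numpy as np

def combine_parents(theta_a, theta_b, alpha=0.5):
    """Hypothetical child initialization: per-tensor interpolation of two
    parent checkpoints that share an architecture (an assumption; the
    actual merge rule is not specified above)."""
    return {k: alpha * theta_a[k] + (1.0 - alpha) * theta_b[k] for k in theta_a}

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distill_loss(child_logits, parent_logits_a, parent_logits_b, T=2.0):
    """Standard temperature-scaled distillation loss, KL(teacher || student),
    where the teacher is the mean of the two parents' softened distributions
    (the two-teacher ensemble is an assumption)."""
    teacher = 0.5 * (softmax(parent_logits_a / T) + softmax(parent_logits_b / T))
    student = softmax(child_logits / T)
    kl = np.sum(teacher * (np.log(teacher + 1e-12) - np.log(student + 1e-12)))
    return float(kl * T * T / child_logits.shape[0])
```

Quantization of the child would then be layered on top (e.g. quantization-aware training with this loss as the task objective).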
-
Hello authors,
Thank you for your excellent work.
I've tried using AIMET to resolve a severe performance degradation caused by quantization when using the SNPE library. However, I've …

-
## Description
Hi,
I have been using the INT8 Entropy Calibrator 2 for INT8 quantization in Python and it’s been working well (TensorRT 10.0.1). The example of how I use the INT8 Entropy Calibra…
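For context on what the calibrator computes internally, here is a simplified numpy sketch of the KL-divergence threshold search that entropy calibration is based on. The real TensorRT implementation differs in binning, smoothing, and search details; this only illustrates the idea of picking the clipping threshold that minimizes information loss:

```python
import numpy as np

def entropy_threshold(activations, num_bins=512, num_levels=128):
    """Pick a clipping threshold for |activation| by minimizing the KL
    divergence between the clipped histogram and its num_levels-level
    quantized approximation (simplified sketch of entropy calibration)."""
    hist, edges = np.histogram(np.abs(activations), bins=num_bins)
    best_kl, best_t = np.inf, edges[-1]
    for i in range(num_levels * 2, num_bins + 1, 4):
        ref = hist[:i].astype(np.float64).copy()
        ref[-1] += hist[i:].sum()          # outliers clipped into the last bin
        if ref.sum() == 0:
            continue
        # merge the i bins down to num_levels levels, then expand back so
        # both distributions share the same support
        idx = np.arange(i) * num_levels // i
        merged = np.bincount(idx, weights=ref, minlength=num_levels)
        counts = np.bincount(idx, weights=(ref > 0).astype(np.float64),
                             minlength=num_levels)
        q = np.where(ref > 0, merged[idx] / np.maximum(counts[idx], 1.0), 0.0)
        p, qd = ref / ref.sum(), q / q.sum()
        nz = p > 0
        kl = float(np.sum(p[nz] * np.log(p[nz] / qd[nz])))
        if kl < best_kl:
            best_kl, best_t = kl, edges[i]
    return best_t
```

The int8 scale then follows as `threshold / 127` for symmetric quantization.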
-
Hi,
Ternary quantization has become popular and has demonstrated computational speedups and power reductions in projects such as llama.cpp and [bitnet.cpp](https://github.com/microsoft/B…
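As a reference point, absmean ternarization in the style of BitNet b1.58 can be sketched in a few lines. Per-tensor scaling is an assumption here (per-row or per-group scaling is also common), and real kernels additionally pack the ternary values rather than storing int8:

```python
import numpy as np

def ternary_quantize(w, eps=1e-8):
    """Absmean ternarization: map weights to {-1, 0, +1} plus one
    per-tensor scale (sketch; BitNet b1.58 uses this rule)."""
    scale = np.abs(w).mean() + eps
    q = np.clip(np.round(w / scale), -1, 1)
    return q.astype(np.int8), scale

def ternary_dequantize(q, scale):
    """Reconstruct approximate float weights from the ternary codes."""
    return q.astype(np.float32) * scale
```

The speedups come from replacing multiplications with additions/subtractions wherever the code is ±1 and skipping zeros entirely.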
-
# TensorRT Model Optimizer - Product Roadmap
[TensorRT Model Optimizer](https://github.com/NVIDIA/TensorRT-Model-Optimizer) (ModelOpt)’s north star is to be the best-in-class model optimization toolki…
-
Hello, ONNX Runtime development team.
We would like to ask a question about the quantization of batch normalization.
We use ONNX Runtime 1.9.0 with static quantization.
If we use "Network A", Batch Normalizatio…
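A common way to sidestep BatchNorm quantization issues is to fold the BN parameters into the preceding convolution or linear layer before quantizing, so no standalone BatchNorm node remains. A minimal numpy sketch for a linear layer (the `eps` value and weight layout are assumptions):

```python
import numpy as np

def fold_bn(w, b, gamma, beta, mean, var, eps=1e-5):
    """Fold BatchNorm (gamma, beta, running mean/var) into the preceding
    layer's weight w (out_features x in_features) and bias b, so that
    x @ w_f.T + b_f == BN(x @ w.T + b)."""
    s = gamma / np.sqrt(var + eps)   # per-output-channel rescale
    w_f = w * s[:, None]
    b_f = (b - mean) * s + beta
    return w_f, b_f
```

After folding, only the fused layer needs quantization parameters, which usually behaves better than quantizing BN separately.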
-
## 🐛 Bug
I'm looking at generating an int8 quantised PyTorch model (both weights and activations at int8) and exporting it to StableHLO via `torch-xla`'s `exported_program_to_stablehlo`.
Right no…
-
### Is your feature request related to a problem?
After documents are ingested by the **text_embedding** processor, an array of float32 values per **knn_vector** field is stored in segments (HNSW or IVF).
…
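One common way to shrink such float32 vectors is scalar quantization to 8-bit codes, cutting storage roughly 4x at a small recall cost. A rough numpy sketch of the generic technique (not OpenSearch's actual implementation; function names are illustrative, and per-dimension min/max is a typical refinement over the global min/max used here):

```python
import numpy as np

def scalar_quantize(vectors):
    """Min/max scalar quantization of float32 vectors to uint8 codes plus
    a shared (lo, scale) pair (sketch of the generic technique)."""
    lo, hi = float(vectors.min()), float(vectors.max())
    scale = (hi - lo) / 255.0 if hi > lo else 1.0
    q = np.round((vectors - lo) / scale).astype(np.uint8)
    return q, lo, scale

def scalar_dequantize(q, lo, scale):
    """Reconstruct approximate float32 vectors from the uint8 codes."""
    return q.astype(np.float32) * scale + lo
```

Distance computations can then run directly on the uint8 codes or on dequantized values, with the original float32 arrays no longer stored.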
-
This is an issue to collect the additional "network health" data we would want to have available in the app API.
There is currently some network data exposed to the app API via the [network…
-
Self-Compressing Neural Networks is a dynamic quantization-aware training method that includes the model's size in the loss
Paper: https://arxiv.org/pdf/2301.13142
Code: https://github.com/geohot/ai-noteb…
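The core mechanism can be sketched from the paper's quantizer, q(x) = 2^e · clip(round(x / 2^e), -2^(b-1), 2^(b-1) - 1), plus a size term added to the task loss. During training, b and e are learned as continuous per-layer parameters via straight-through estimators; the forward math below and the `gamma` weighting are a simplified sketch, not the paper's exact formulation:

```python
import numpy as np

def quantize(x, b, e):
    """Quantize x to b bits with exponent e:
    q(x) = 2^e * clip(round(x / 2^e), -2^(b-1), 2^(b-1) - 1)."""
    lo, hi = -2 ** (b - 1), 2 ** (b - 1) - 1
    return 2.0 ** e * np.clip(np.round(x / 2.0 ** e), lo, hi)

def size_loss(bit_depths, weight_counts, gamma=1.0):
    """Model-size term added to the task loss: gamma times the total bit
    count across layers (gamma is an assumed hyperparameter name)."""
    total_bits = sum(b * n for b, n in zip(bit_depths, weight_counts))
    return gamma * total_bits
```

Because the size term penalizes bits, gradient descent drives each layer's bit depth down until the accuracy loss outweighs the size saving.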