-
### Describe the issue
I have a pre-trained CNN model saved as a TensorFlow SavedModel, and I converted it to an **.onnx form** as well as a **static quantized .onnx form**, and their inference latency at the…
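A quick way to compare the two models is to time each ONNX Runtime session's `run` call directly. The sketch below is a generic timing harness; the `measure_latency` helper and the session names in the comments are illustrative, not part of any library:

```python
import time
import statistics

def measure_latency(run_fn, warmup=5, iters=50):
    """Time a single-inference callable; returns the median latency in ms."""
    for _ in range(warmup):                     # warm up caches / lazy init
        run_fn()
    samples = []
    for _ in range(iters):
        t0 = time.perf_counter()
        run_fn()
        samples.append((time.perf_counter() - t0) * 1e3)
    return statistics.median(samples)

# Hypothetical usage with two onnxruntime.InferenceSession objects:
#   fp32_ms = measure_latency(lambda: fp32_session.run(None, {"input": x}))
#   int8_ms = measure_latency(lambda: int8_session.run(None, {"input": x}))
# Placeholder workload so the harness itself is runnable:
ms = measure_latency(lambda: sum(i * i for i in range(10_000)))
print(f"median latency: {ms:.3f} ms")
```

Using the median rather than the mean makes the comparison robust to one-off scheduling spikes.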
vonJJ updated 4 months ago
-
Status: Draft
Updated: 09/18/2024
# Objective
In this doc we’ll cover how the different optimization techniques in torchao are structured and how to contribute to them.
# torchao Stack Ove…
-
### 📚 The doc issue
How do I run the example execution_runner .exe, after building it with CMake from the tutorial https://pytorch.org/executorch/stable/getting-started-setup.html, with multiple threa…
-
I'd like to use this as a drop-in for `wick` and just wanted to check a couple of things. Your paper states that
> Similarly, it would be desirable to expand WICK&D to mixed fermionic/bosonic fiel…
-
## 🚀 Feature
Currently (as of 1.8.1) torch.nn.quantized.functional.conv1d/2d/3d/linear always require the output to be requantized to 8 bits. Conv operators ask for an output scale and zero point, while …
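For context, the requantization these operators perform can be modeled in a few lines: the int32 accumulator carries scale `input_scale * weight_scale`, and the output scale and zero point map it back to int8. This is an illustrative sketch of per-tensor affine requantization, not PyTorch's actual kernel:

```python
def requantize_to_int8(acc_int32, in_scale, w_scale, out_scale, out_zero_point):
    """Requantize a conv/linear int32 accumulator to a signed 8-bit output.

    The accumulator's implicit scale is in_scale * w_scale; the result is
    re-expressed at out_scale / out_zero_point and saturated to [-128, 127].
    """
    real = acc_int32 * (in_scale * w_scale)           # dequantize accumulator
    q = round(real / out_scale) + out_zero_point      # quantize to output params
    return max(-128, min(127, q))                     # clamp to the int8 range

print(requantize_to_int8(1200, in_scale=0.02, w_scale=0.01,
                         out_scale=0.05, out_zero_point=0))  # → 5
```

The feature request above amounts to skipping the `round`/clamp step and returning the accumulator (or a dequantized float) instead.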
-
I have pushed SD performance to the maximum. Currently I can generate 200 images per second on my 4090 using 1-step sd-turbo, the onediff compiler, the stable-fast compiler, and my own optimizations. …
-
`Tensor` objects (or more generally: anything fulfilling the `AbstractTensor` concept) are expected to divide their indices into bra and ket indices. There seems to be a connection (at least notation-…
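As a toy illustration of that bra/ket partition (the class and method names below are hypothetical, not the library's actual `AbstractTensor` interface):

```python
from dataclasses import dataclass

@dataclass
class IndexedTensor:
    """Toy stand-in for an AbstractTensor: indices split into bra and ket.

    Hypothetical sketch only: a contraction pairs a ket index of one tensor
    with a matching bra index of another.
    """
    label: str
    bra: tuple = ()
    ket: tuple = ()

    def contractions_with(self, other):
        """Ket indices of self that can contract against bra indices of other."""
        return sorted(set(self.ket) & set(other.bra))

f = IndexedTensor("f", bra=("p",), ket=("q",))
g = IndexedTensor("g", bra=("q", "r"), ket=("s", "t"))
print(f.contractions_with(g))  # → ['q']
```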
-
### Supported
- FullyConnected
### Not yet
- Conv
- DepthwiseConv
- BatchMatMul
- LSTM
- RNN
-
### What
Let's support int8 quantization in circle-quantizer.
### Why
Onert-micro supports int8 quantized kernels and contains faster CMSIS-NN kernels, which work with int8 quantization, not …
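For reference, the symmetric per-tensor int8 scheme that CMSIS-NN-style kernels typically consume can be sketched as follows (an illustrative helper, not circle-quantizer's actual code):

```python
def quantize_int8(values):
    """Symmetric per-tensor int8 quantization.

    The scale maps the largest magnitude onto 127; the zero point is fixed
    at 0, which is what symmetric signed schemes assume.
    """
    max_abs = max(abs(v) for v in values) or 1.0
    scale = max_abs / 127.0
    q = [max(-128, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float values from int8 codes."""
    return [qi * scale for qi in q]

q, s = quantize_int8([-1.0, 0.0, 0.25, 0.6, 1.0])
print(q)  # → [-127, 0, 32, 76, 127]
```

Dequantizing `q` with `s` recovers the inputs to within one quantization step, which is the error budget int8 kernels are designed around.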
-
When I deploy on the Qualcomm HTP, I want the bias quantization to be 32 bits, but I can't find a parameter in AIMET's configuration file to set the bias quantization bit width. There are only settings…
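For background, most int8 backends derive the 32-bit bias parameters rather than exposing them as a setting: the bias shares the accumulator's scale, `input_scale * weight_scale`, with zero point 0, so it can be added directly to the int32 accumulator. A sketch of that convention (illustrative helper, not AIMET's API):

```python
def quantize_bias_int32(bias, input_scale, weight_scale):
    """Quantize a float bias vector to 32 bits the way int8 backends expect.

    bias_scale = input_scale * weight_scale, zero point 0, saturated to the
    signed 32-bit range so it matches the conv/linear accumulator.
    """
    bias_scale = input_scale * weight_scale
    lo, hi = -(2**31), 2**31 - 1
    return [max(lo, min(hi, round(b / bias_scale))) for b in bias]

print(quantize_bias_int32([0.5, -0.25], input_scale=0.02, weight_scale=0.01))
# → [2500, -1250]
```

Because the scale is fully determined by the input and weight scales, a separate bias bit-width knob would be redundant under this convention, which may be why the configuration file only exposes the other settings.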