-
A770, Ubuntu system
```bash
for n in $(seq 16 16); do
  echo "Model= $MODEL RATE= 0.7 N= $n..."
  python3 benchmark_vllm_throughput.py \
    --backend vllm \
    --m…
```
-
Hi! I was wondering whether there is any way in Caffe to compress a neural network.
In this paper [Deep compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman…
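Caffe has no built-in compression pass, but the magnitude-pruning step at the core of the Deep Compression paper is easy to sketch on raw weight arrays. The sketch below is plain NumPy, not a Caffe API; the function name and threshold scheme are illustrative assumptions:

```python
import numpy as np

def magnitude_prune(weights, sparsity=0.9):
    """Zero out the smallest-magnitude weights until roughly `sparsity`
    fraction of the entries are zero (the pruning step of Deep Compression).
    Illustrative helper, not part of Caffe."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy(), 0.0
    # k-th smallest magnitude becomes the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask, 1.0 - mask.mean()

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64)).astype(np.float32)
pruned, achieved = magnitude_prune(w, sparsity=0.9)
print(f"achieved sparsity: {achieved:.2f}")
```

In the paper this is followed by retraining the surviving weights, then quantization and Huffman coding; the pruning alone already gives most of the size reduction once the sparse matrix is stored in compressed form.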
-
Hello everyone,
First off, a big thanks to city96 for the awesome work they've been contributing to the community. It's been incredibly helpful!
Here are my system specs:
Processor: Intel i5-13…
-
### 1. System information
Colab, as of 2023-10-23
### 2. Code
Please see the attached Colab notebook here:
https://colab.research.google.com/drive/1yUD0nDu8oeeDtQBa7xCbQWx_w8PxS4UC?usp=sharin…
-
**What would you like to be added/modified**:
A benchmark suite for large language models deployed at the edge using KubeEdge-Ianvs:
1. Interface Design and Usage Guidelines Document;
2. Implem…
-
### Documentation issue/request
There is no useful information on quantization. How do I perform it? What settings should I choose for the different quantization types Q8, Q5 (and what would be the difference i…
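As rough intuition for what the bit counts in names like Q8 and Q5 imply: fewer bits means fewer representable levels and a larger reconstruction error. The sketch below is plain per-tensor uniform quantization, not the actual block-quantized file formats those names refer to:

```python
import numpy as np

def quantize_dequantize(x, bits):
    """Uniform symmetric quantization to signed `bits`-bit integers and back.
    A toy model of the precision loss only, not a real Q8/Q5 format."""
    qmax = 2 ** (bits - 1) - 1           # 127 for 8 bits, 15 for 5 bits
    scale = np.max(np.abs(x)) / qmax
    q = np.clip(np.round(x / scale), -qmax, qmax)
    return q * scale

rng = np.random.default_rng(0)
w = rng.standard_normal(10_000).astype(np.float32)
for bits in (8, 5):
    err = np.mean((w - quantize_dequantize(w, bits)) ** 2)
    print(f"{bits}-bit mean squared error: {err:.2e}")
```

Going from 8 to 5 bits multiplies the step size by 8, so the mean squared error grows by roughly 64x; the practical trade-off is file size and memory versus that loss of fidelity.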
-
### Describe the issue
MatMul in ONNX OpSet 13 started to support bf16 (https://onnx.ai/onnx/operators/onnx__MatMul.html)
However, we don't see an implementation for bfloat16 in the CPU EP for …
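For reference, what a bf16 MatMul would compute can be emulated in NumPy by truncating float32 mantissas to bfloat16 precision before the multiply. This is not the ONNX Runtime implementation; it assumes bf16 inputs with float32 accumulation, and uses simple truncation where real kernels typically round to nearest even:

```python
import numpy as np

def to_bf16(x):
    """Emulate bfloat16 by zeroing the low 16 bits of each float32
    (truncation; real hardware usually rounds to nearest even)."""
    u = x.astype(np.float32).view(np.uint32)
    return (u & np.uint32(0xFFFF0000)).view(np.float32)

def matmul_bf16(a, b):
    # Inputs reduced to bf16 precision; the accumulation stays in
    # float32, as is common for CPU bf16 kernels.
    return to_bf16(a) @ to_bf16(b)

rng = np.random.default_rng(0)
a = rng.standard_normal((4, 8)).astype(np.float32)
b = rng.standard_normal((8, 3)).astype(np.float32)
print(np.max(np.abs(matmul_bf16(a, b) - a @ b)))
```

With only 8 mantissa bits the inputs carry roughly 2-3 decimal digits, so the deviation from the float32 result is small but measurable; that precision trade is exactly why bf16 support is a per-kernel decision in each execution provider.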
-
When I try to train the VQVAE on my own data, I find that the loss for VQVAE training is only the reconstruction loss: https://github.com/PKU-YuanGroup/Open-Sora-Plan/blob/fdc786bc8e52d6386fb32c833eba0b4db286ca7b/o…
-
Hi there,
I applied transformix with trained parameters to my labels and got the transformed label with some "quantization noise", which I suspect comes from data-type conversion to floating point.
So is…
-
Hi there,
According to the documentation
https://github.com/analogdevicesinc/ai8x-training#quantization-aware-training-qat
we can use either QAT or post-training quantization, but can I use both of them? If …
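For background, the two techniques differ in when the rounding happens: QAT simulates the quantize-dequantize step inside the forward pass during training, while post-training quantization rounds the finished weights once. A generic sketch of both (plain NumPy, not the ai8x-training API) also shows that applying the same scheme twice changes nothing:

```python
import numpy as np

def qdq(w, bits=8):
    """Quantize to signed `bits`-bit integers and back
    (symmetric, per-tensor). Illustrative, not the ai8x scheme."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(w)) / qmax
    return np.clip(np.round(w / scale), -qmax, qmax) * scale

rng = np.random.default_rng(1)
w = rng.standard_normal(16).astype(np.float32)

# QAT: the forward pass uses the quantized-dequantized weights, so the
# training loss already reflects the rounding error and the optimizer
# can compensate for it.
w_qat_forward = qdq(w, bits=8)

# Post-training quantization: round the final weights once, after
# training, with no chance to compensate.
w_ptq = qdq(w, bits=8)

# Re-quantizing already-quantized weights under the same scheme is a
# no-op, which is why "both" usually just means QAT followed by export.
w_twice = qdq(w_ptq, bits=8)
print(np.allclose(w_ptq, w_twice))
```

Whether combining them helps in a given toolchain depends on whether the two steps use the same quantization scheme; if they don't, the second rounding can add error on top of what QAT already absorbed.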