-
# LoRA: Low-Rank Adaptation of Large Language Models
Starting from a large pre-trained model, the task-specific fine-tuning is stored in pairs of low-rank matrices; a low intrinsic dimension of $r=4$ is often sufficient.
Pros:
- Can be deployed in parallel without affecting inference speed; the amount of task-specific information is relatively small.
- The method is extremely insensitive to hyperparameters.
Additionally:
- For the model…
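The idea above can be sketched as follows. This is a minimal illustration, not the paper's exact setup: shapes, initialization scale, and the zero-init of `B` are assumptions (though zero-initializing one factor so training starts from the frozen weights is the common convention).

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r = 64, 64, 4   # low intrinsic dimension r = 4, as in the note

# Frozen pre-trained weight; the adapted weight is W = W0 + B @ A,
# where A (r x d_in) and B (d_out x r) are the trainable low-rank pair.
W0 = rng.standard_normal((d_out, d_in))
A = rng.standard_normal((r, d_in)) * 0.01  # trainable, small random init
B = np.zeros((d_out, r))                   # trainable, zero init: W == W0 at start

def forward(x):
    # Equivalent to (W0 + B @ A) @ x, keeping the low-rank pair separate
    return W0 @ x + B @ (A @ x)

x = rng.standard_normal(d_in)
# B is zero-initialized, so the adapted model matches the pre-trained one exactly
assert np.allclose(forward(x), W0 @ x)

# For deployment the pair can be merged into a single matrix,
# which is why inference speed is unaffected:
W_merged = W0 + B @ A
```

The per-task storage is only the `A`/`B` pair, `r * (d_in + d_out)` parameters instead of `d_in * d_out`, which is what makes the task-specific information so small.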
-
### Issue Type
Build/Install
### Modules Involved
MPC protocol
### Have you reproduced the bug with SPU HEAD?
Yes
### Have you searched existing issues?
Yes
### SPU Version
spu 0.9.0.dev20240…
-
Hello! I was excited to discover the nengo project! I want to simulate my neuron model in nengo_loihi or nengo_FPGA. However, my neuron model can fire negative spikes. I know that nengo supports negative…
-
### 🐛 Describe the bug
Hello,
I am running llama3-70b and mixtral with VLLM on a bunch of different kinds of machines. I encountered wildly different output quality on A10 GPUs vs. A100/H…
-
### Search before asking
- [X] I have searched the YOLOv8 [issues](https://github.com/ultralytics/ultralytics/issues) and [discussions](https://github.com/ultralytics/ultralytics/discussions) and fou…
-
Papers:
- Same, Same But Different - Recovering Neural Network Quantization Error Through Weight Factorization https://arxiv.org/abs/1902.01917
- Up or Down? Adaptive Rounding for Post-Training …
-
Hi,
I trained a YOLOv8 model and exported it to ONNX format using the quantization recipe below. I set weight_bits=8 and activation_bits=8 to ensure the full-flow inference of the quantized model is …
-
The [tutorial](https://pytorchvideo.org/docs/tutorial_accelerator_build_your_model) shows how to build an efficient network with modules provided by "pytorchvideo.layers.accelerator" and how to conver…
-
### Search before asking
- [X] I have searched the YOLOv8 [issues](https://github.com/ultralytics/ultralytics/issues) and [discussions](https://github.com/ultralytics/ultralytics/discussions) and fou…
-
I am trying to save a quantized ternary model to a `.tflite` file, but larq does not seem to save the weights using reduced-precision datatypes, so the file size is not compressed.
However, after c…