-
Papers:
- Same, Same But Different - Recovering Neural Network Quantization Error Through Weight Factorization https://arxiv.org/abs/1902.01917
- Up or Down? Adaptive Rounding for Post-Training Quantization https://arxiv.org/abs/2004.10568
-
# LoRA: Low-Rank Adaptation of Large Language Models
Starting from a large pre-trained model, the fine-tuning for a given task is stored in a pair of low-rank matrices; a low intrinsic dimension of $r=4$ is enough (see the sketch after this list).
Pros:
- Parallelization does not affect speed, and the task-specific information is relatively small.
- The method is highly insensitive to hyperparameters.
Also:
- As for the model…
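A minimal sketch of the idea, assuming PyTorch (the `LoRALinear` class and its initialization constants are illustrative, not the paper's reference implementation): the pre-trained weight is frozen, and only the rank-$r$ update $BA$ is trained.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen pre-trained linear layer plus a trainable low-rank update B @ A."""

    def __init__(self, in_features, out_features, r=4, alpha=4):
        super().__init__()
        # In practice this would be loaded from the pre-trained checkpoint.
        self.base = nn.Linear(in_features, out_features)
        self.base.weight.requires_grad_(False)  # pre-trained weight stays frozen
        self.base.bias.requires_grad_(False)
        # Only these two small matrices need to be stored per task.
        self.A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_features, r))  # zero init: update starts at 0
        self.scaling = alpha / r

    def forward(self, x):
        return self.base(x) + self.scaling * (x @ self.A.t() @ self.B.t())
```

Because $B$ starts at zero, training begins exactly at the pre-trained model; at inference the update $BA$ can be merged into the base weight, which is why the adaptation does not cost extra speed.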
-
Hello! I was excited to discover the Nengo project! I want to simulate my neuron model in nengo_loihi or nengo_FPGA. However, my neuron model can fire negative spikes. I know Nengo supports negative…
-
The [tutorial](https://pytorchvideo.org/docs/tutorial_accelerator_build_your_model) shows how to build an efficient network with modules provided by "pytorchvideo.layers.accelerator" and how to conver…
-
jetson@ubuntu:~$ jetson-containers run $(autotag nano_llm) python3 -m nano_llm.chat --api=mlc --model Efficient-Large-Model/VILA1.5-3b --max-context-len 256 --max-new-tokens 32 --pro…
-
## Review meeting on two papers
- Google HAWQ, and... what was the other one?
- Share the ClovaNote transcript of the paper-review meeting
https://clovanote.naver.com/s/E7H2Gf2hhiVSs
Password: 6rbxgc
- Need to find the paper titles and URLs and update the Readme file with a Reference section
Create a new "Reference" branch…
-
Dear @AlexeyAB,
I designed a network and trained it on my custom dataset using the darknet framework.
The network weights are about 280 MB. I want to use model compression techniques
(Model Pruning, K…
-
But when the input is outside the range [quant_min, quant_max], shouldn't the gradient be 0.0 instead of 1.0?
The following code snippet sets both quant_min and quant_max to 0 and defines the input te…
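Since the quoted snippet is cut off, here is a self-contained reconstruction of the setup being described, assuming PyTorch's `torch.fake_quantize_per_tensor_affine`; the specific input values below are my own choice, not the original poster's.

```python
import torch

# quant_min = quant_max = 0: the representable quantized range is the single value 0,
# so every input is effectively out of range.
x = torch.tensor([-1.0, 0.5, 2.0], requires_grad=True)
y = torch.fake_quantize_per_tensor_affine(
    x, scale=1.0, zero_point=0, quant_min=0, quant_max=0)
y.sum().backward()
# Inspect which entries receive gradient 1.0 vs 0.0 through the
# straight-through estimator; the question above is why out-of-range
# inputs are not always masked to 0.0.
print(x.grad)
```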
-
I am not able to understand what these factorA and factorB parameters in the trained network are. Can someone provide a hint?
-
I was wondering if there is a way to perform power-of-2 quantization with Larq. Maybe a specific quantizer is needed? Any suggestions?
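As far as I know Larq does not ship a power-of-2 quantizer out of the box, but its layers accept any callable as a quantizer, so a custom one seems like a reasonable route. A hedged sketch (the function name and the straight-through gradient choice are mine, not Larq API):

```python
import tensorflow as tf
import larq

@tf.custom_gradient
def pow2_quantizer(x):
    # Snap each value to the nearest signed power of two: sign(x) * 2^round(log2|x|).
    log2 = tf.math.log(tf.maximum(tf.abs(x), 1e-8)) / tf.math.log(2.0)
    q = tf.sign(x) * tf.pow(2.0, tf.round(log2))
    def grad(dy):
        return dy  # straight-through estimator: pass gradients through unchanged
    return q, grad

# Used like any built-in Larq quantizer:
layer = larq.layers.QuantDense(32, kernel_quantizer=pow2_quantizer)
```

Powers of two are attractive here because multiplication by the quantized weight reduces to a bit shift in fixed-point hardware.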