-
Hi,
I am an M.Sc. student implementing network pruning/compression from the `Learning both Weights and Connections for Efficient Neural Networks` paper as my final project. I am using Torc…
-
- [ ] [MoAI/README.md at master · ByungKwanLee/MoAI](https://github.com/ByungKwanLee/MoAI/blob/master/README.md?plain=1)
# MoAI/README.md at master · ByungKwanLee/MoAI
## Description
![MoAI: Mixture…
-
A commonly requested feature for Boost.Histogram in C++ is conversion to and from ROOT histograms. In Python, we can already do that with aghast, but not from C++. Calling into a Python library fr…
-
Does anyone have benchmark results comparing tiny-cnn with other software such as Caffe, Theano, cuDNN, etc.?
For example, for:
1. small networks (where I hope tiny-cnn performs better than the others)
2. big netwo…
-
1. [1st Place Solution to Google Landmark Retrieval 2020](https://storage.googleapis.com/kaggle-forum-message-attachments/978542/16699/1st_Place_Solution_to_Google_Landmark_Retrieval_2020_modified.pdf…
-
### 🚀 The feature, motivation and pitch
PPO and a number of other LLM fine-tuning techniques require autoregressive generation as part of the training process. When using vLLM to speed up the autor…
-
### System Info
Dear authors,
I have a question about training time when using the peft package. I tried using LoRA with a Swin Transformer to reduce the parameter count.
```
model = Swi…
-
- [ ] [Answer.AI - You can now train a 70b language model at home](https://www.answer.ai/posts/2024-03-06-fsdp-qlora.html)
# Answer.AI - You can now train a 70b language model at home
**DESCRIPTION:…
-
### Conferences within 7 days
- ACM WSDM (Web Search and Data Mining) 2023 https://www.wsdm-conference.org/2023/
> 2/27~3/3, Singapore
- NDSS (Network and Distributed System Security Symposium) http…
-
When I load the model as follows, it throws the error: Cannot merge LORA layers when the model is loaded in 8-bit mode
How can I load the model in 4-bit mode for inference?
`
model_path = 'decapoda-resea…