-
### Reminder
- [X] I have read the README and searched the existing issues.
### System Info
[2024-09-17 10:58:53,418] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda…
-
# LoRA: Low-Rank Adaptation of Large Language Models
Starting from a large pre-trained model, the fine-tuning update for a given task is stored in pairs of low-rank matrices; a low intrinsic dimension of $r=4$ is enough.
Pros:
- Parallelization does not affect speed, and the amount of task-specific information to store is relatively small.
- The method is extremely insensitive to hyperparameters.
Additionally:
- For the model…
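
A minimal sketch of the idea (illustrative, not the paper's reference implementation): the pre-trained weight $W_0$ stays frozen and only a low-rank pair $B, A$ of rank $r$ is trained, so the effective update is $\Delta W = BA$.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base linear layer plus a trainable low-rank update B @ A (sketch only)."""

    def __init__(self, base: nn.Linear, r: int = 4, alpha: float = 8.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the pre-trained weights W0
        # A is small random, B is zero, so the update starts at exactly zero.
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # y = W0 x + (alpha / r) * B A x ; only lora_A and lora_B receive gradients
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling
```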
-
General
- [ ] README.md could list the directory layout and what each folder contains. Alternatively, each folder should have a self-explanatory name so that documentation is not needed, e.g. "data" f…
-
**As an** LLM researcher
**I need** to develop standardized testing protocols for fine-tuning LLMs
**So that** I can evaluate the effectiveness and efficiency of different fine-tuning strategies
###…
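
One possible shape for such a protocol, sketched as a config object (field names are illustrative and not part of this issue):

```python
from dataclasses import dataclass, field

@dataclass
class FineTuneProtocol:
    """Illustrative spec for one standardized fine-tuning evaluation run."""
    method: str                       # e.g. "full", "lora", "qlora"
    base_model: str                   # model identifier under test
    train_dataset: str
    eval_datasets: list[str] = field(default_factory=list)
    metrics: list[str] = field(default_factory=lambda: ["accuracy", "train_time_hours", "gpu_memory_gb"])
    seeds: list[int] = field(default_factory=lambda: [0, 1, 2])  # repeated runs to estimate variance
```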
-
# Offline Alternative to Google's Read Along App in Hindi
## Description
Develop an offline application (POC - web) that can display a set of Hindi words and accurately determine if the user has p…
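
A minimal sketch of the word-check step, assuming an offline recognizer such as Vosk with a Hindi model (the model name, audio format, and matching logic below are illustrative, not a chosen design):

```python
import json
import wave
from vosk import Model, KaldiRecognizer  # offline speech recognition

def said_target_word(wav_path: str, target: str,
                     model_dir: str = "vosk-model-small-hi-0.22") -> bool:
    """Return True if the offline recognizer hears the target Hindi word in the recording."""
    model = Model(model_dir)                      # local Hindi model directory
    wf = wave.open(wav_path, "rb")                # expects 16 kHz mono PCM audio
    rec = KaldiRecognizer(model, wf.getframerate())
    while True:
        data = wf.readframes(4000)
        if not data:
            break
        rec.AcceptWaveform(data)
    heard = json.loads(rec.FinalResult()).get("text", "")
    return target in heard.split()                # naive exact-match check
```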
-
**Describe the bug**
I'm using 24 A100 (40 GB) GPUs for LLaMA-2-70B training, and I previously had a lot of OOM issues with DeepSpeed ZeRO-3, so I'm currently using multi-card parameter parallel…
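
For reference, a typical ZeRO-3 configuration with CPU offload, often used to reduce memory pressure on 40 GB cards, looks roughly like the sketch below (values are illustrative, not the actual settings used here):

```python
# Illustrative DeepSpeed ZeRO-3 config with CPU offload (not the reporter's actual settings).
ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "gradient_accumulation_steps": 16,
    "bf16": {"enabled": True},
    "zero_optimization": {
        "stage": 3,
        "offload_param": {"device": "cpu", "pin_memory": True},
        "offload_optimizer": {"device": "cpu", "pin_memory": True},
        "overlap_comm": True,
        "stage3_gather_16bit_weights_on_model_save": True,
    },
}
```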
-
- [ ] [S-LoRA: Serving Thousands of Models From One GPU for Fun and Profit - OpenPipe](https://openpipe.ai/blog/s-lora)
# S-LoRA: Serving Thousands of Models From One GPU for Fun and Profit - OpenPi…
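
Not the S-LoRA system itself, but a minimal illustration of the idea it describes, one frozen base model serving many small adapters, using Hugging Face peft (model and adapter names are placeholders):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-2-7b-hf"          # placeholder base model
base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
tok = AutoTokenizer.from_pretrained(base_id)

# Load several LoRA adapters onto the single shared base model.
model = PeftModel.from_pretrained(base, "org/adapter-support", adapter_name="support")
model.load_adapter("org/adapter-sql", adapter_name="sql")   # placeholder adapter repos

def generate(prompt: str, adapter: str) -> str:
    model.set_adapter(adapter)                # route this request to one tenant's adapter
    inputs = tok(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=64)
    return tok.decode(out[0], skip_special_tokens=True)
```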
-
*Sent by Google Scholar Alerts (scholaralerts-noreply@google.com). Created by [fire](https://fire.fundersclub.com/).*
---
### [PDF] [Timo: Towards Better Temporal Reasoning for Language M…
-
Hi there,
Was the Conifier model trained using full-parameter fine-tuning, or was a parameter-efficient method such as (Q)LoRA employed?
Thanks in advance for the clarification!
-
### 🐛 Describe the bug
When trying to both train LoRA layers on the base model and set `modules_to_save` in the LoRA config, which makes the embedding layers trainable (my assumption is it also ap…
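
For context, a setup along these lines, with illustrative target and saved-module names rather than the exact config used here, would be:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")  # placeholder model

config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],           # LoRA layers on attention projections
    modules_to_save=["embed_tokens", "lm_head"],   # full copies of these layers become trainable
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()
```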