-
Hi, I just noticed that when finetuning with adapter_v2, we save the final model under the name `lit_model.pth.adapter_v2`:
```
# Save the final Adapter checkpoint at the end of training
sav…
-
I started the training using:
```
python qlora.py \
--model_name_or_path /home/nap/llm_models/llamaOG-65B-hf/ \
--output_dir ./output \
--dataset alpaca \
--do_train True \
…
-
### Your current environment
```text
The output of `python collect_env.py`
Collecting environment information...
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: …
-
## Jiphyeonjeon (집현전) Latest-Papers Study Group
- Sunday, May 15, 2022, 10:00
- Presenters: 진명훈, 전재영, 박동주
- Paper link: https://arxiv.org/abs/2203.15556
> ### Abstract
> We investigate the optimal model size and number of tokens for training a tr…
-
I've noticed that the generation diverges after some tokens in comparison to the HF implementation. Is this expected?
Here's how to reproduce:
**Transformers**
```python
import torch
from tra…
-
I checked this [issue](https://github.com/EleutherAI/lm-evaluation-harness/issues/714#top), which describes a problem similar to mine; however, using the latest main branch doesn't solve it!
## Model:
- F…
-
Hello, I have a set of pretrained models, and I plan to evaluate them locally on the MMLU-Pro benchmark without any additional training, then select the best-performing model for submission. Is this app…
-
Hello, I'm trying to evaluate the GPT-4o model using the MMLU dataset, but I'm encountering an error. Could you advise me on how to proceed?
This is the command I used:
lm_eval --model openai…
-
There have been many discussions in the community regarding support for multiple models.
- ChatGPTNextWeb#3484
- ChatGPTNextWeb#3923
- ChatGPTNextWeb#960
- ChatGPTNextWeb#3431
- ChatGPTNextWeb#…
-
Hi, thanks for sharing this great open-source project! When using multiple GPUs for evaluation, I found that partitioned tasks sometimes fail because the port they try to use is already occupied.
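
For illustration, one common way to sidestep occupied ports is to ask the OS for a free one before launching each task. This is only a minimal sketch under my own assumptions, not the project's actual launcher code; `find_free_port` is a hypothetical helper name.

```python
import socket


def find_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for a currently unused TCP port (hypothetical helper)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # Binding to port 0 lets the OS choose any free port.
        sock.bind((host, 0))
        # getsockname() returns (host, port) for an AF_INET socket.
        return sock.getsockname()[1]


if __name__ == "__main__":
    print(find_free_port())
```

Note that the port is released as soon as the helper returns, so another process could still grab it before the task binds to it; retrying on failure would be needed to fully close that window.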
### Prerequisite
- [X] I have s…