-
Hi,
I've seen a lot of DSPy tutorials for English already, and I wanted to give it a try for Bangla. The code below gives me wrong answers for all the questions that I ask my DSPy-based RAG system. I ne…
-
**Title of the talk/workshop**
The Guide to Building Open Indic LLMs Today
**Abstract of the talk/workshop**
- Steps in training modern LLMs
- Challenges specific to Indic Models (Tokeni…
-
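This trains faster than FA2 and uses less VRAM, and it is very effective in tests on Llama 2. However, it does not yet support Qwen2 natively. Some third-party scripts can llamafy Qwen, but the risk of implementation errors makes them unappealing to try. Unsloth has already been integrated into LLaMA-Factory.
Benchmark: https://unsloth.ai/blog/mistral-benchmark#Benchmark%20tabl…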
-
When I use the natural language interrogator, JoyCaption, I get an error message:
![Capture](https://github.com/user-attachments/assets/d9e07f22-31ad-4584-9529-c962b7c3639c)
'Failed t…
-
In the [notebook](https://colab.research.google.com/drive/1fxDWAfPIbC-bHwDSVj5SBmEJ6KG3bUu5?usp=sharing#scrollTo=LjY75GoYUCB8) where you discussed how the absence of the `` token affects the training lo…
AvisP updated
3 weeks ago
-
I'm assuming it only works on Ampere, Hopper, Lovelace. Is that correct? It might be nice to specify in the readme, if it is limited to certain GPU types.
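For anyone who wants to check which of those families their own card falls into, here is a minimal sketch that maps a CUDA compute capability to an architecture name. The helper names `arch_name` and `local_gpu_arch` are made up for illustration. The mapping only covers the three families named above and says nothing about which GPUs the project actually supports.

```python
def arch_name(major: int, minor: int) -> str:
    """Map a CUDA compute capability to an NVIDIA architecture name.

    Only the three families mentioned in the question are distinguished;
    everything else is reported as "other".
    """
    if (major, minor) == (8, 9):
        return "Ada Lovelace"  # e.g. RTX 40 series
    if major == 8:
        return "Ampere"        # 8.0, 8.6, 8.7
    if major == 9:
        return "Hopper"        # 9.0
    return "other"


def local_gpu_arch() -> str:
    """Report the architecture of GPU 0, if PyTorch and CUDA are available."""
    try:
        import torch
    except ImportError:
        return "unknown (PyTorch not installed)"
    if not torch.cuda.is_available():
        return "unknown (no CUDA device visible)"
    # get_device_capability returns a (major, minor) tuple
    return arch_name(*torch.cuda.get_device_capability(0))
```

Calling `local_gpu_arch()` on the training machine gives a quick answer without looking up the card's spec sheet.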
-
### Reminder
- [X] I have read the README and searched the existing issues.
### System Info
Latest version, Ubuntu 24.04
### Reproduction
Run LLaMA Pro and LoRA together for finetuning on a model…
-
RuntimeError: [enforce fail at inline_container.cc:595] . unexpected pos 3072467392 vs 3072467280
setup:
from trl import SFTTrainer
from transformers import TrainingArguments
from unsloth impor…
-
Hi,
I've finetuned Llama 3.1 8B with Unsloth, but I get an unhandled exception when running inference. This seems related to the bugfix I saw in 2024.8; perhaps something is still missing there?
Here'…
ztick updated
3 months ago
-
pytorch:2.3.0
cuda:11.8
flash-attn:2.5.9.post1
python 3.10
unsloth was installed with `pip install git+https://github.com/yangjianxin1/unsloth.git`
It runs without unsloth enabled; with it enabled, after changing max_length to 512, per device_train_bat…