-
**Request:** As a casual user without much knowledge of LLMs, it would be nice to know upfront how much disk space the models need.
**Currently:** The various posts and docs only mention that llama…
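As a rough rule of thumb, weights on disk are approximately parameter count times bytes per parameter; actual files add tokenizer/config overhead and vary by format. A minimal sketch of that estimate (the constants are illustrative):

```python
# Rough on-disk size of model weights from parameter count and bytes
# per parameter. This is an estimate only: real checkpoints include
# tokenizer files, configs, and format-specific overhead.
def approx_size_gb(n_params: float, bytes_per_param: float) -> float:
    return n_params * bytes_per_param / 1e9

# An 8B-parameter model in fp16 (2 bytes/param) is about 16 GB;
# a 4-bit quantization (~0.5 bytes/param) is about 4 GB.
print(approx_size_gb(8e9, 2))    # fp16
print(approx_size_gb(8e9, 0.5))  # 4-bit
```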
-
In my training script, I set the **per_device_train_batch_size = 4** in the TrainingArguments.
But the **train_batch_size** in the **trainer_state.json** of each checkpoint is **2**.
When I tried …
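For context, the effective batch size that training frameworks report is typically the per-device value scaled by device count and gradient accumulation. A minimal sketch of that arithmetic (the names mirror transformers' `TrainingArguments` fields, but this is plain arithmetic, not the library's own code):

```python
# Effective (total) batch size per optimizer step, derived from the
# per-device setting. Names mirror transformers' TrainingArguments
# fields; this is illustrative arithmetic, not library code.
def effective_train_batch_size(per_device_train_batch_size: int,
                               n_gpus: int,
                               gradient_accumulation_steps: int = 1) -> int:
    return (per_device_train_batch_size
            * n_gpus
            * gradient_accumulation_steps)

# With per_device_train_batch_size=4 on one GPU and no accumulation,
# the total is 4; a recorded value of 2 in trainer_state.json would
# suggest the argument was overridden somewhere else in the setup.
print(effective_train_batch_size(4, 1))
```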
-
I have trained **Llama3.1** and it performs well, but when I ask it a question outside the domain, it answers it instead of declining.
Is there a workaround for this?
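One common workaround is to augment the fine-tuning data with explicit refusal examples for out-of-domain questions, so the model learns to decline. A minimal sketch; the domain name, field names, and sample questions below are all hypothetical:

```python
# Sketch: build refusal examples to mix into the fine-tuning dataset.
# "medical billing" and the instruction/output field names are
# illustrative placeholders, not from the original report.
IN_DOMAIN = "medical billing"

def make_refusal_example(question: str) -> dict:
    return {
        "instruction": question,
        "output": (f"I can only answer questions about {IN_DOMAIN}. "
                   "Please ask something within that domain."),
    }

augmented = [make_refusal_example(q) for q in
             ["What's the capital of France?",
              "Write me a poem about cats."]]
print(len(augmented))
```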
-
When training either the Llama 3 or 3.1 8B base model using the Llama 3 conversation prompt template, it does not seem to train with the correct tokens. It ends up producing text containing tokens…
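One way to debug this is to build a turn by hand and compare it against what the training pipeline produces. The sketch below follows the published Llama 3 prompt format; verify it against your tokenizer's `chat_template` before relying on it:

```python
# Sketch of the Llama 3 conversation format, useful for checking that
# training prompts carry the expected special tokens. Based on the
# published template; confirm against tokenizer.chat_template.
def format_llama3_turn(role: str, content: str) -> str:
    return (f"<|start_header_id|>{role}<|end_header_id|>\n\n"
            f"{content}<|eot_id|>")

prompt = "<|begin_of_text|>" + format_llama3_turn("user", "Hi")
print(prompt)
```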
-
-
Hi there,
First, thank you for unsloth; it's great!
I've fine-tuned a llama-3-8b-Instruct-bnb-4bit model and pushed it to the HF Hub. When I try to deploy it using [hf Inference Endpoints](https://huggingfa…
-
### Feature request
Recent Mistral models, including mistral 7b v0.3 instruct, ship a consolidated.safetensors file whose weight key names differ from what LoRAx expects. Also there are keys…
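Until native support lands, one workaround is to remap the key names before loading. A minimal sketch with a purely illustrative prefix mapping; the real Mistral-to-LoRAx key table would need to be filled in:

```python
# Sketch: rename state-dict keys by prefix so a loader finds the
# layout it expects. The mapping below is a hypothetical example,
# not the actual Mistral/LoRAx key table.
def remap_keys(state_dict: dict, prefix_map: dict) -> dict:
    out = {}
    for key, tensor in state_dict.items():
        new_key = key
        for old, new in prefix_map.items():
            if key.startswith(old):
                new_key = new + key[len(old):]
                break
        out[new_key] = tensor
    return out

mapping = {"layers.": "model.layers."}  # illustrative prefix mapping
print(remap_keys({"layers.0.attention.wq.weight": 0}, mapping))
```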
-
### System Info
1. Below are my dependency versions.
```
flash_attn==2.6.3
numpy==1.24.4
Pillow==10.4.0
Requests==2.32.3
transformers==4.44.2
accelerate==0.34.0
peft==0.12.0
datasets==2…
```
-
## Question 1
Hello. After running the script `llava_llama3_8b_instruct_qlora_clip_vit_large_p14_336_e1_gpu1_finetune.py`, I converted the saved model from the `.pth` format to the `xtuner` format; the resulting file structure is as follows:
Why does this model's structure differ from the file structure of the open-source model?
**xtuner/llava-llama-3-8b-v1_1…
-
Would it be possible to support `apple/OpenELM-3B-Instruct` and [`apple/OpenELM-3B`](https://huggingface.co/apple/OpenELM-3B) in the same way the Phi-3 models are supported in the ["Finetune for Free" section of…