-
2024-11-05 09:04:41 | INFO | yolox.core.vid_trainer:240 - ---> start train epoch1
2024-11-05 09:04:41 | INFO | yolox.core.vid_trainer:235 - Training of experiment is done and the best AP is 0…
-
### System Info
- `transformers` version: 4.44.2
- Platform: macOS-14.4-arm64-arm-64bit
- Python version: 3.12.2
- Huggingface_hub version: 0.24.5
- Safetensors version: 0.4.3
- Accelerate versi…
-
**Description**:
Hello, I encountered a `torch.cuda.OutOfMemoryError` while fine-tuning a model using `trainer.py`. My setup includes only a single GPU with 32GB of memory, and the error occurs eve…
-
## Environment
- GPU: A100
- GPU memory: 40 GB
- SWIFT version: v2.5.2
## Training script
```
CUDA_VISIBLE_DEVICES=0 PYTORCH_CUDA_ALLOC_CONF="expandable_segments:True" swift sft \
--model_type llama3_2-11b-vision-instruct …
-
### Bug description
I was able to fine-tune an 8B LLM using the Hugging Face training framework with PEFT + DeepSpeed stage 2 under fp16 precision (mixed-precision training). Recently I would like to change…
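
For context on the setup described above, here is a rough back-of-envelope sketch of per-GPU model-state memory for PEFT with DeepSpeed ZeRO stage 2 under fp16 mixed precision. It assumes the usual accounting for mixed-precision Adam (2 bytes per parameter for fp16 weights, 2 for fp16 gradients, 12 for the fp32 master copy plus the two Adam moments) and that stage 2 partitions gradients and optimizer states across GPUs while weights stay replicated. The adapter size and GPU count are hypothetical, since the report does not state them, and activations, buffers and allocator overhead are ignored.

```python
def zero2_peft_bytes_per_gpu(base_params, trainable_params, n_gpus):
    """Approximate model-state bytes per GPU (activations excluded)."""
    frozen_weights = 2 * base_params                 # fp16 base model, replicated
    adapter_weights = 2 * trainable_params           # fp16 adapter weights, replicated
    gradients = 2 * trainable_params / n_gpus        # fp16 grads, partitioned (stage 2)
    optimizer = 12 * trainable_params / n_gpus       # fp32 master + Adam m, v, partitioned
    return frozen_weights + adapter_weights + gradients + optimizer


GIB = 1024 ** 3

# Hypothetical values: only the 8B base size comes from the report.
estimate = zero2_peft_bytes_per_gpu(
    base_params=8_000_000_000,
    trainable_params=40_000_000,   # assumed adapter size
    n_gpus=4,                      # assumed GPU count
)
print(f"approx. model states per GPU: {estimate / GIB:.2f} GiB")
```

Under these assumptions the frozen fp16 base weights dominate, so the precision of the base model matters far more for memory than the adapter or optimizer settings.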
-
Hello.
I have been using qli-Client 2.2.1 for quite some time without any problems.
I am running it under WSL2 on Windows 11 with Ubuntu 22.04.5.
I mine with my NVIDIA 4090 GPU.
For the drivers,…
-
### Description
Hello, when I run the example files provided by the [official document](https://github.com/Toni-SM/skrl/blob/main/docs/source/examples/deepmind/dm_manipulation_stack_sac.py), an erro…
-
Hello, I tried to train Llama3.2 3B. It's a full fine-tune, not a LoRA, but Unsloth always crashes under varying conditions when the model should be saved. Hardware was RunPod in all cases, different c…
-
**Describe the bug**
```
TypeError Traceback (most recent call last)
Cell In[7], line 1
----> 1 trainer.fit(
2 train_dataset=filtered_dataset.train,
…
-
Hi, thanks for the library! This is more of a discussion than an issue. It seems that when using Unsloth or the Hugging Face Trainer to fully fine-tune a ~1B model, the GPU utilization is >90%, while memor…
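
As a reference point for memory numbers in a full fine-tune like this, here is a minimal sketch of the model-state footprint under plain fp32 training with Adam. It assumes 4 bytes per parameter for weights, 4 for gradients and 8 for the two Adam moments, and it treats "~1B" as exactly 1e9 parameters. Activations, CUDA context and allocator overhead are not included, so real usage will be higher.

```python
def model_state_bytes(n_params, weight_bytes=4, grad_bytes=4, optim_bytes=8):
    """Bytes for weights, gradients and Adam moments; activations excluded."""
    return n_params * (weight_bytes + grad_bytes + optim_bytes)


GIB = 1024 ** 3

n_params = 1_000_000_000  # assumption: "~1B" taken as exactly 1e9
fp32_total = model_state_bytes(n_params)
print(f"fp32 weights + grads + Adam states: {fp32_total / GIB:.1f} GiB")  # 14.9 GiB
```

The per-parameter byte counts are keyword arguments so the same function can be reused for other setups, for example an 8-bit optimizer, by passing different values.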