-
### Motivation.
TL;DR: There is high CPU overhead associated with each decode batch due to the processing and generation of input/output. Multi-step decoding will be able to amortize all these overh…
-
Hi all, I am trying to fine-tune models with extremely long contexts.
I've tested the training setup below and managed to fine-tune:
- llama3.1-1B with a max_sequence_length of 128 * 1024 tokens
…
-
**Describe the bug (Mandatory)**
When fine-tuning the Qwen2-7b-instruct model with IA3, an error is raised in mindnlp.core.nn.modules.container.py:
![image](https://github.com/user-attachments/assets/5ef35812-ef13-4b51-8e95-a00…
-
I don't know how to run this. If you could show me how, I'd appreciate it.
![image](https://github.com/user-attachments/assets/3fde4f94-0c4d-49e0-8dd2-77b9cce22630)
-
### Reminder
- [X] I have read the README and searched the existing issues.
### System Info
LLaMA Factory, version 0.9.1.dev0
torch, version 2.4+cuda12.1
########################################…
-
Hi, I faced the following error when trying to run multi-stream models:
using dlc image augmentation pipeline
Error executing job with overrides: []
Traceback (most recent call last):
File "/…
-
### Nomad version
Nomad v1.1.3 (8c0c8140997329136971e66e4c2337dfcf932692)
### Operating system and Environment details
Linux and macOS, different versions.
### Issue
We run 1000+ sy…
-
### Reminder
- [X] I have read the README and searched the existing issues.
### System Info
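With the same total_batch_size, training on a single node (8 GPUs) runs at the same speed as training on multiple nodes (16 GPUs). This is an obstacle for anyone who wants to use this repository to scale up their data.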
### Reproduction
Launched with torchrun.
The script is…
-
I'm trying to train on my data:
accelerate launch --num_cpu_threads_per_process=2 "./sdxl_train_network.py"
--pretrained_model_name_or_path="/content/civitai/realismEngineSDXL…
-
# EventRadar V1 Design Specification
## 1. Overview
EventRadar is a lightweight, extensible framework for building event monitoring and processing pipelines in Elixir.
### 1.1 Core Features
- Dynam…