-
This issue describes the high-level direction for creating LLM Engine V1. We want the design to be as transparent as possible, so we created this issue to track progress and solicit feedback.
Goal…
-
For now, if we have several scheduler components, they work concurrently. In some scenarios, the probability of conflicts can be relatively high, for example: a high deployment water level, batch sche…
-
I am training on imageslides, and when I set `batch_size=2` in prompts.yaml,
it prints the error message `Error Occurred!: (512, 512, x, x)`, but training does not stop,
and when I set the batch size t…
-
### Reminder
- [X] I have read the README and searched the existing issues.
### System Info
### model
# model_name_or_path: /mnt/nas/shanzhi/eval_models/Qwen2-7B
model_name_or_path: /mnt/nas/liya…
-
### Search before asking
- [X] I had searched in the [issues](https://github.com/apache/dolphinscheduler/issues?q=is%3Aissue) and found no similar feature requirement.
### Description
After…
-
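Hello, when fine-tuning the bloom-7b model with the finetune script on an instruction fine-tuning dataset, the following appears during the first few steps:
tried to get lr value before scheduler/optimizer started stepping, returning lr=0
What causes this warning?
The bloom config is: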
{
"model_type": "bloom…
-
Hi,
I've encountered the following error while trying to train a network on my custom dataset.
```
UnboundLocalError Traceback (most recent call last)
in ()
14 tra…
-
**Is your feature request related to a problem? Please describe.**
I'm serving a model that supports batching (`max_batch_size` > 0) and I would like to use config autocomplete, but I don't want to u…
-
### Please check that this issue hasn't been reported before.
- [X] I searched previous [Bug Reports](https://github.com/axolotl-ai-cloud/axolotl/labels/bug) and didn't find any similar reports.
### Exp…
-
**Consistency training fails to converge**
When I use examples/research_projects/consistency_training/train_cm_ct_unconditional.py to train the consistency model, I try to follow the experimental par…