-
When running stag1.sh, the following error occurs:
```
module._apply(fn)
File "/root/miniconda3/envs/graphgpt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 639, in _apply
module._apply(fn)
[Previ…
-
### Please check that this issue hasn't been reported before.
- [X] I searched previous [Bug Reports](https://github.com/axolotl-ai-cloud/axolotl/labels/bug) and didn't find any similar reports.
###…
-
### Question
### **ValueError: MPTForCausalLM only supports tied word embeddings**
Here is the terminal error:
```
(env) root@4acd6379ec91:/workspace/reverse_prompting/LLaVA# torchrun --nn…
-
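Thank you, as always, for your work on development.
I am filing this bug report because multiple concepts were not learned at all with SD3.5L.
I have previously trained LoRAs containing more than 10 concepts on SD1.5 and SDXL.
This time I tried repeatedly to train more than 10 concepts on SD3.5L in the same way, but it does not work at all.
So, as a test, I trained 4 concepts at the same time to see whether the training was properly reflected…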
-
██████████████████████████████████████████████████████| 3/3 [01:10
-
Training for 50 epochs on CIFAR-10 with
```
OMP_NUM_THREADS=1 python -m torch.distributed.launch --nproc_per_node=1 train.py --num_workers 4 --batch_size 128 --epochs 50
```
and then boosting with…
-
This is the script I used for fine-tuning.
```
export HF_DATASETS_OFFLINE=1
export TRANSFORMERS_OFFLINE=1
export PDSH_RCMD_TYPE=ssh
# NCCL setting
export GLOO_SOCKET_IFNAME=bond0
export NCCL_SO…
-
With the original code unchanged, I increased the amount of training data. When training the PBAFN_e2e code, the run reaches epoch 78 and then stalls: the network neither keeps running nor reports an error, and is at a standstill…
-
## Description
This issue aims to emphasize the significance of introducing support for ClickHouse migrations to enhance the overall development experience, facilitate efficient evolution and maint…
-
**Please edit the OP** to add whatever fixes we applied to the core and which need to be propagated upstream into:
1. https://github.com/microsoft/Megatron-DeepSpeed
2. https://github.com/NVIDIA/Meg…