-
### Checklist
- [ ] The issue exists after disabling all extensions
- [ ] The issue exists on a clean installation of webui
- [ ] The issue is caused by an extension, but I believe it is caused by a …
-
Traceback (most recent call last):
  File "train.py", line 277, in <module>
    batch_loss_n, pred = solver.optimize(index+1,epoch)
  File "/home/jayakumar/MSMDFF-NET-main/utils/frame_work_general.py", lin…
-
The problem: some systems have async work to do, which may yield. They don't want to, or simply can't, do the work right away. For example, they can be called via FFI, or want to collect a batch of such request…
-
## ❓ Questions and Help
I have trained my transformer model once on a single GPU and once on a multi-core TPU. In both cases a batch size of 256 is used (times 8 for the TPU). My training results…
-
Could the problem be that I have a GTX 1050 Ti 4 GB? (Playing with options to lower VRAM usage does not help.) When I play with settings I get the same thing, but the last thing changes to returned non-zer…
-
### Reminder
- [X] I have read the README and searched the existing issues.
### System Info
Training command:
llamafactory-cli train \
--stage dpo \
--do_train \
--finetuning_type full \
…
-
Dear,
Thank you for the great work! I am running your code for point cloud completion and get the following error during inference. I did some debugging and realized that, using the code below, after self.update_…
-
DeepSpeed version: 0.15.2
Transformers: 4.45.2
Training command: deepspeed GOT/train/train_GOT.py --deepspeed zero_config/zero2.json --model_name_or_path /home/GOT-OCR2.0/GOT-OCR-2.0-master/GOT_weights --…
-
I get it at different batches in the first epoch, not always the same one, but it seems to happen around 70-80% progress of the first batch.
```
----------------------------------------------------------------…
-
### Describe the bug
I've gone through all the steps to install Sora, and the last step, running gradio/app.py, fails about 2/3 of the way through. It hangs on loading shards at 0% and then I get the follow…