-
Hello author, I ran into this problem while running your training code. Is there a way to fix it?
/home/amax/anaconda3/bin/conda run -n WalMaFa --no-capture-output python /data1/WalMaFa/train.py
load training yaml file: ./configs/LOL/train/training_LOL.yaml
==…
-
I ran into a problem when training on a single RTX 4090. After 36k training steps, some black sub-images appear in the predicted output. The learning rate is set to 5e-5 and the batch size is 64.
Can you gi…
-
## ❓ Questions and Help
Hi All,
I have this code:
```
import optuna
from torch.optim.lr_scheduler import ReduceLROnPlateau
# Assuming dataset is already defined
train_size = int(0.8 * len(da…
-
I have adapted this to run with PyTorch's simple DataParallel, but the model sometimes seems to output NaNs. Have you been able to train this across multiple GPUs on a single node?
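To narrow down when the NaNs first appear, one option is to record the loss at every step and find the first non-finite value. The following is only a minimal sketch: the helper name `first_non_finite` and the sample values are hypothetical and not part of this project. In a PyTorch loop the values would typically come from `loss.item()`.

```python
import math
from typing import Iterable, Optional


def first_non_finite(losses: Iterable[float]) -> Optional[int]:
    """Return the index of the first NaN or infinite loss, or None if all are finite."""
    for step, value in enumerate(losses):
        if not math.isfinite(value):
            return step
    return None


# Hypothetical per-step loss values (assumption: collected via loss.item()).
recorded = [0.92, 0.71, float("nan"), 0.65]
bad_step = first_non_finite(recorded)
if bad_step is not None:
    print(f"first non-finite loss at step {bad_step}")
```

Knowing the first bad step makes it easier to inspect the batch and the per-GPU outputs at that point, rather than only noticing the problem later.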
-
Traceback (most recent call last):
  File "/home/jiayi/lmms-finetune-main/train.py", line 248, in <module>
    train()
  File "/home/jiayi/lmms-finetune-main/train.py", line 240, in train
    trainer.trai…
-
### Checklist
- [X] I've looked through the [README](https://github.com/hbatalhaStch/react-big-scheduler#readme)
- [X] I've verified that I'm running react-big-scheduler-stch version **1.3.0**
### P…
-
Sometimes, when training with the SimCLR method, the loss diverges (see attached screenshot). I wonder if anyone has ever experienced this kind of issue when training with SimCLR. Thi…
-
### Reminder
- [X] I have read the README and searched the existing issues.
### System Info
- `llamafactory` version: 0.9.1.dev0
- Platform: Linux-4.19.91-014.15-kangaroo.alios7.x86_64-x86_64-with…
-
Why is `batch_size*2` passed as a parameter in the following lines instead of `batch_size`?
In `train_LEP.py`:
`noisy_image, noise_level, timesteps = noisy_latent(latent_image, pipe.scheduler, b…
-
I want to use Faster R-CNN for object detection instead of Mask R-CNN for segmentation. It now always reports insufficient GPU memory, even though my GPU memory still has free space. Maybe there is a problem wi…