I followed the tutorial and ran the training script:
python main.py --base configs/stable-diffusion/v1-finetune_unfrozen.yaml --actual_resume pre_trained_sd/sd-v1-4-full-ema.ckpt -n mycup --gpus 1, --data_root img/training --reg_data_root img/regularization --class_word cup --no-test True
However, the process stopped before training actually started, and the output log is as follows:
rank_zero_deprecation(
Monitoring val/loss_simple_ema as checkpoint metric.
Merged modelckpt-cfg:
{'target': 'pytorch_lightning.callbacks.ModelCheckpoint', 'params': {'dirpath': 'logs/training2023-09-12T11-36-32_mycup/checkpoints', 'filename': '{epoch:06}', 'verbose': True, 'save_last': True, 'monitor': 'val/loss_simple_ema', 'save_top_k': 1, 'every_n_train_steps': 500}}
/opt/conda/envs/ldm/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py:286: LightningDeprecationWarning: Passing `Trainer(accelerator='ddp')` has been deprecated in v1.5 and will be removed in v1.7. Use `Trainer(strategy='ddp')` instead.
rank_zero_deprecation(
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
#### Data #####
train, PersonalizedBase, 600
reg, PersonalizedBase, 60
validation, PersonalizedBase, 6
accumulate_grad_batches = 1
++++ NOT USING LR SCALING ++++
Setting learning rate to 1.00e-06
I can't tell where the problem is, and there were no error messages in the output.