TemporalLabsLLC-SOL opened 2 years ago
I have the same problem, did you solve it?
Add `--gpus=1`, it works.
same problem
same problem
There are a couple of known fixes depending on your specific env. I can compile some links later, but use the search function too.
@xzdong-2019, may I ask how you solved it? I mean, where should we add `--gpus=1`?
```
python main.py --gpus 0, --prompt ....
```

It works for me.
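In case the trailing comma puzzles anyone: PyTorch Lightning treats `--gpus 0` as "zero GPUs" but `--gpus 0,` as the list `[0]`, i.e. "use GPU index 0". A quick way to see this (a sketch against the PL 1.4/1.5-era API this repo pins; it needs a CUDA-capable machine, since Lightning raises a `MisconfigurationException` when GPUs are requested but unavailable):

```python
# Sketch: how PyTorch Lightning 1.4/1.5 interprets the --gpus value.
# The helper lived in pytorch_lightning.utilities.device_parser in that era;
# it moved in later versions.
from pytorch_lightning.utilities.device_parser import parse_gpu_ids

print(parse_gpu_ids("0"))   # None -> zero GPUs, the model stays on the CPU
print(parse_gpu_ids("0,"))  # [0]  -> use GPU index 0
print(parse_gpu_ids("1"))   # [0]  -> "one GPU", which also resolves to index 0
```

So `--gpus 0` silently gives you the CPU-only run that produces the warning and crash in the log below, while `--gpus 0,` or `--gpus 1` actually engages the GPU.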
```
C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\loggers\test_tube.py:104: LightningDeprecationWarning: The TestTubeLogger is deprecated since v1.5 and will be removed in v1.7. We recommend switching to the pytorch_lightning.loggers.TensorBoardLogger as an alternative.
  rank_zero_deprecation(
Monitoring val/loss_simple_ema as checkpoint metric.
Merged modelckpt-cfg:
{'target': 'pytorch_lightning.callbacks.ModelCheckpoint', 'params': {'dirpath': 'logs\\SUBJECT2022-10-04T06-25-48_DSU90\\checkpoints', 'filename': '{epoch:06}', 'verbose': True, 'save_last': True, 'monitor': 'val/loss_simple_ema', 'save_top_k': 1, 'every_n_train_steps': 500}}
GPU available: True, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\trainer\trainer.py:1584: UserWarning: GPU available but not used. Set the gpus flag in your trainer `Trainer(gpus=1)` or script `--gpus=1`.
  rank_zero_warn(
#### Data #####
train, PersonalizedBase, 1500
reg, PersonalizedBase, 15000
validation, PersonalizedBase, 15
accumulate_grad_batches = 1
++++ NOT USING LR SCALING ++++
Setting learning rate to 1.00e-06
C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\trainer\configuration_validator.py:275: LightningDeprecationWarning: The `on_keyboard_interrupt` callback hook was deprecated in v1.5 and will be removed in v1.7. Please use the `on_exception` callback hook instead.
  rank_zero_deprecation(
C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\trainer\configuration_validator.py:284: LightningDeprecationWarning: Base `LightningModule.on_train_batch_start` hook signature has changed in v1.5. The `dataloader_idx` argument will be removed in v1.7.
  rank_zero_deprecation(
C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\trainer\configuration_validator.py:291: LightningDeprecationWarning: Base `Callback.on_train_batch_end` hook signature has changed in v1.5. The `dataloader_idx` argument will be removed in v1.7.
  rank_zero_deprecation(
C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\core\datamodule.py:469: LightningDeprecationWarning: DataModule.setup has already been called, so it will not be called again. In v1.6 this behavior will change to always call DataModule.setup.
  rank_zero_deprecation(
LatentDiffusion: Also optimizing conditioner params!

Project config
model:
  base_learning_rate: 1.0e-06
  target: ldm.models.diffusion.ddpm.LatentDiffusion
  params:
    reg_weight: 1.0
    linear_start: 0.00085
    linear_end: 0.012
    num_timesteps_cond: 1
    log_every_t: 200
    timesteps: 1000
    first_stage_key: image
    cond_stage_key: caption
    image_size: 64
    channels: 4
    cond_stage_trainable: true
    conditioning_key: crossattn
    monitor: val/loss_simple_ema
    scale_factor: 0.18215
    use_ema: false
    embedding_reg_weight: 0.0
    unfreeze_model: true
    model_lr: 1.0e-06
    personalization_config:
      target: ldm.modules.embedding_manager.EmbeddingManager
      params:
        placeholder_strings:

Lightning config
modelcheckpoint:
  params:
    every_n_train_steps: 500
callbacks:
  image_logger:
    target: main.ImageLogger
    params:
      batch_frequency: 200
      max_images: 8
      increase_log_steps: false
trainer:
  benchmark: true
  max_steps: 800
  gpus: 0

  | Name              | Type               | Params
---------------------------------------------------------
0 | model             | DiffusionWrapper   | 859 M
1 | first_stage_model | AutoencoderKL      | 83.7 M
2 | cond_stage_model  | FrozenCLIPEmbedder | 123 M
---------------------------------------------------------
982 M     Trainable params
83.7 M    Non-trainable params
1.1 B     Total params
4,264.941 Total estimated model params size (MB)
Validation sanity check: 0it [00:00, ?it/s]
C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\trainer\data_loading.py:132: UserWarning: The dataloader, val_dataloader 0, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument (try 8 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
  rank_zero_warn(
Validation sanity check:   0%|          | 0/2 [00:00<?, ?it/s]
Summoning checkpoint.
Traceback (most recent call last):
  File "main.py", line 838, in <module>
    trainer.fit(model, data)
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 740, in fit
    self._call_and_handle_interrupt(
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 685, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 777, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1199, in _run
    self._dispatch()
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1279, in _dispatch
    self.training_type_plugin.start_training(self)
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\plugins\training_type\training_type_plugin.py", line 202, in start_training
    self._results = trainer.run_stage()
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1289, in run_stage
    return self._run_train()
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1311, in _run_train
    self._run_sanity_check(self.lightning_module)
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1375, in _run_sanity_check
    self._evaluation_loop.run()
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\loops\base.py", line 145, in run
    self.advance(*args, **kwargs)
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\loops\dataloader\evaluation_loop.py", line 110, in advance
    dl_outputs = self.epoch_loop.run(dataloader, dataloader_idx, dl_max_batches, self.num_dataloaders)
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\loops\base.py", line 145, in run
    self.advance(*args, **kwargs)
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\loops\epoch\evaluation_epoch_loop.py", line 122, in advance
    output = self._evaluation_step(batch, batch_idx, dataloader_idx)
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\loops\epoch\evaluation_epoch_loop.py", line 217, in _evaluation_step
    output = self.trainer.accelerator.validation_step(step_kwargs)
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\accelerators\accelerator.py", line 236, in validation_step
    return self.training_type_plugin.validation_step(*step_kwargs.values())
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\plugins\training_type\training_type_plugin.py", line 219, in validation_step
    return self.model.validation_step(*args, **kwargs)
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "C:\Users\Urban\Desktop\Dreambooth-SD-optimized-main\ldm\models\diffusion\ddpm.py", line 368, in validation_step
    _, loss_dict_no_ema = self.shared_step(batch)
  File "C:\Users\Urban\Desktop\Dreambooth-SD-optimized-main\ldm\models\diffusion\ddpm.py", line 908, in shared_step
    loss = self(x, c)
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\Urban\Desktop\Dreambooth-SD-optimized-main\ldm\models\diffusion\ddpm.py", line 937, in forward
    c = self.get_learned_conditioning(c)
  File "C:\Users\Urban\Desktop\Dreambooth-SD-optimized-main\ldm\models\diffusion\ddpm.py", line 595, in get_learned_conditioning
    c = self.cond_stage_model.encode(c, embedding_manager=self.embedding_manager)
  File "C:\Users\Urban\Desktop\Dreambooth-SD-optimized-main\ldm\modules\encoders\modules.py", line 324, in encode
    return self(text, **kwargs)
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\Urban\Desktop\Dreambooth-SD-optimized-main\ldm\modules\encoders\modules.py", line 319, in forward
    z = self.transformer(input_ids=tokens, **kwargs)
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\Urban\Desktop\Dreambooth-SD-optimized-main\ldm\modules\encoders\modules.py", line 297, in transformer_forward
    return self.text_model(
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\Urban\Desktop\Dreambooth-SD-optimized-main\ldm\modules\encoders\modules.py", line 258, in text_encoder_forward
    hidden_states = self.embeddings(input_ids=input_ids, position_ids=position_ids, embedding_manager=embedding_manager)
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\Urban\Desktop\Dreambooth-SD-optimized-main\ldm\modules\encoders\modules.py", line 180, in embedding_forward
    inputs_embeds = self.token_embedding(input_ids)
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\torch\nn\modules\sparse.py", line 158, in forward
    return F.embedding(
  File "C:\Users\Urban\anaconda3\envs\ldm\lib\site-packages\torch\nn\functional.py", line 2199, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument index in method wrapper__index_select)
```
I'm in need of any perspective anybody can give on making the GPU calls work correctly in a Windows environment where WSL is not an option.
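For what it's worth, the log shows the cause: `GPU available: True, used: False` together with `gpus: 0` in the Lightning config means the Trainer never moved the model off the CPU, while the frozen CLIP encoder moves its token ids to `cuda` by default, hence the cpu vs. cuda:0 mismatch inside `torch.embedding`. WSL shouldn't be needed; passing `--gpus 0,` (trailing comma, as in the command earlier in this thread) from a plain Anaconda prompt is the fix others report. If you want to confirm where things actually live before `trainer.fit(model, data)` runs, here is a minimal sketch; the `report_devices` helper is hypothetical, not part of the repo:

```python
import torch

def report_devices(module: torch.nn.Module, name: str) -> None:
    # Print every device this module's parameters live on; a correctly
    # placed model reports exactly one device.
    devices = {str(p.device) for p in module.parameters()}
    print(f"{name}: {devices}")

# e.g. drop into main.py just before trainer.fit(model, data):
# report_devices(model, "LatentDiffusion")      # {'cpu'} with --gpus 0, {'cuda:0'} with --gpus 0,
# report_devices(model.cond_stage_model, "CLIP encoder")
```

If the model reports `{'cpu'}` while the crash mentions `cuda:0`, the `--gpus` value is the thing to fix, not the code.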