Closed: Hartmon8 closed this issue 2 months ago.
Thanks for sending an issue! Here are some tips for you:
If this is your first time, please read our contributor guidelines: https://github.com/mindspore-ai/mindspore/blob/master/CONTRIBUTING.md
Hardware Environment
Ascend

Describe the current behavior
Normal image: ![6a5a764f0fbd4ef607ea0396213d3c9](https://github.com/mindspore-lab/mindone/assets/8011191/76fbb117-9f29-4beb-bc7a-a1493acb1073)
Abnormal image: ![aebbc245ce9ca5a863aae36f0efc00d](https://github.com/mindspore-lab/mindone/assets/8011191/8e9f5721-53cb-485f-91a6-100caed27493)

Describe the expected behavior
Normal images should be generated.

Steps to reproduce the issue
Config file:
```yaml
version: SDXL-base-1.0
model:
  target: gm.models.diffusion.DiffusionEngine
  params:
    disable_first_stage_amp: True
    scale_factor: 0.5
    latents_mean:
      - -1.6574
      - 1.886
      - -1.383
      - 2.5155
    latents_std:
      - 8.4927
      - 5.9022
      - 6.5498
      - 5.2299
    denoiser_config:
      target: gm.modules.diffusionmodules.denoiser.Denoiser
      params:
        weighting_config:
          target: gm.modules.diffusionmodules.denoiser_weighting.EDMWeighting
          params:
            sigma_data: 0.5
        scaling_config:
          target: gm.modules.diffusionmodules.denoiser_scaling.EDMScaling
          params:
            sigma_data: 0.5
    network_config:
      target: gm.modules.diffusionmodules.openaimodel.UNetModel
      params:
        adm_in_channels: 2816
        num_classes: sequential
        in_channels: 4
        out_channels: 4
        model_channels: 320
        attention_resolutions: [4, 2]
        num_res_blocks: 2
        channel_mult: [1, 2, 4]
        num_head_channels: 64
        use_spatial_transformer: True
        use_linear_in_transformer: True
        transformer_depth: [1, 2, 10]  # note: the first is unused (due to attn_res starting at 2) 32, 16, 8 --> 64, 32, 16
        context_dim: 2048
        spatial_transformer_attn_type: flash-attention  # vanilla, flash-attention
        legacy: False
        use_recompute: True
    conditioner_config:
      target: gm.modules.GeneralConditioner
      params:
        emb_models:
          # crossattn cond
          - is_trainable: False
            input_key: txt
            target: gm.modules.embedders.modules.FrozenCLIPEmbedder
            params:
              layer: hidden
              layer_idx: 11
              version: /data/sdtest/models/models--openai--clip-vit-large-patch14/snapshots/8d052a0f05efbaefbc9e8786ba291cfdf93e5bff
              # pretrained: ''
          # crossattn and vector cond
          - is_trainable: False
            input_key: txt
            target: gm.modules.embedders.modules.FrozenOpenCLIPEmbedder2
            params:
              arch: ViT-bigG-14-Text
              freeze: True
              layer: penultimate
              always_return_pooled: True
              legacy: False
              require_pretrained: False
              # pretrained: ''  # laion2b_s32b_b79k.ckpt
          # vector cond
          - is_trainable: False
            input_key: original_size_as_tuple
            target: gm.modules.embedders.modules.ConcatTimestepEmbedderND
            params:
              outdim: 256  # multiplied by two
          # vector cond
          - is_trainable: False
            input_key: crop_coords_top_left
            target: gm.modules.embedders.modules.ConcatTimestepEmbedderND
            params:
              outdim: 256  # multiplied by two
          # vector cond
          - is_trainable: False
            input_key: target_size_as_tuple
            target: gm.modules.embedders.modules.ConcatTimestepEmbedderND
            params:
              outdim: 256  # multiplied by two
    first_stage_config:
      target: gm.models.autoencoder.AutoencoderKLInferenceWrapper
      params:
        embed_dim: 4
        monitor: val/rec_loss
        ddconfig:
          attn_type: vanilla
          double_z: true
          z_channels: 4
          resolution: 256
          in_channels: 3
          out_ch: 3
          ch: 128
          ch_mult: [1, 2, 4, 4]
          num_res_blocks: 2
          attn_resolutions: []
          dropout: 0.0
        lossconfig:
          target: mindspore.nn.Identity
    sigma_sampler_config:
      target: gm.modules.diffusionmodules.sigma_sampling.EDMSampling
      params:
        p_mean: 0
        p_std: 0.6
    loss_fn_config:
      target: gm.modules.diffusionmodules.loss.StandardDiffusionLoss

optim:
  base_learning_rate: 1e-6
  optimizer_config:
    target: mindspore.nn.AdamWeightDecay  # mindspore.nn.SGD
    params:
      beta1: 0.9
      beta2: 0.999
      weight_decay: 0.01
  scheduler_config:
    target: gm.lr_scheduler.LambdaWarmUpScheduler
    params:
      warm_up_steps: 50
  # scheduler_config:
  #   target: gm.lr_scheduler.LambdaWarmUpCosineScheduler
  #   params:
  #     warm_up_steps: 62
  #     lr_min: 0.0
  #     lr_max: 1.0
  #     lr_start: 0.0
  #     max_decay_steps: -1

data:
  per_batch_size: 3
  num_epochs: 20
  num_parallel_workers: 32
  python_multiprocessing: True
  shuffle: True
  dataset_config:
    target: gm.data.dataset_wds.T2I_Webdataset
    params:
      caption_key: 'text_english'
      target_size: 1024
      transforms:
        - target: gm.data.mappers.Resize
          params:
            size: 1024
            interpolation: 3
        - target: gm.data.mappers.Rescaler
          params:
            isfloat: False
        - target: gm.data.mappers.AddOriginalImageSizeAsTupleAndCropToSquare
        - target: gm.data.mappers.RandomHorizontalFlip
        - target: gm.data.mappers.Transpose
          params:
            type: hwc2chw
```
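For context, the EDM pieces selected in this config (`EDMSampling` with `p_mean: 0`, `p_std: 0.6`, and `EDMWeighting`/`EDMScaling` with `sigma_data: 0.5`) correspond to the preconditioning scheme from Karras et al. (2022). A minimal sketch of what those modules presumably compute — the function names here are illustrative, not the repo's actual API:

```python
import math
import random

SIGMA_DATA = 0.5  # sigma_data from the config above

def sample_sigma(p_mean: float = 0.0, p_std: float = 0.6) -> float:
    """EDM-style sigma sampling: draw a log-normal noise level,
    log(sigma) ~ N(p_mean, p_std^2)."""
    return math.exp(p_mean + p_std * random.gauss(0.0, 1.0))

def edm_scalings(sigma: float, sigma_data: float = SIGMA_DATA):
    """EDM preconditioning coefficients; the denoiser output is combined as
    D(x, sigma) = c_skip * x + c_out * F(c_in * x, c_noise)."""
    s2 = sigma ** 2 + sigma_data ** 2
    c_skip = sigma_data ** 2 / s2
    c_out = sigma * sigma_data / math.sqrt(s2)
    c_in = 1.0 / math.sqrt(s2)
    c_noise = 0.25 * math.log(sigma)
    return c_skip, c_out, c_in, c_noise

def edm_weight(sigma: float, sigma_data: float = SIGMA_DATA) -> float:
    """EDM loss weight: (sigma^2 + sigma_data^2) / (sigma * sigma_data)^2."""
    return (sigma ** 2 + sigma_data ** 2) / (sigma * sigma_data) ** 2
```

The key point for the discussion below is that these coefficients change what the UNet's input and output mean, so they must match how the loaded checkpoint was trained.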
Did the training command line include the `--param_fp16 True` flag?
Yes: `--param_fp16 True`.
@Hartmon8 The config file shows EDM training. Were the pretrained weights you loaded also trained with EDM?
No, they were not. After turning off the EDM config options, the output is normal.
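That matches the symptom: the base SDXL-1.0 checkpoint is an eps-prediction model, so wrapping it in EDM preconditioning changes what the UNet's input and output represent. A rough comparison, assuming the standard eps scaling (`c_skip = 1`, `c_out = -sigma`, `c_in = 1/sqrt(sigma^2 + 1)`, `c_noise = sigma`) as in Stability's sgm reference code — this is an illustrative sketch, not the repo's implementation:

```python
import math

def eps_scalings(sigma: float):
    """Eps-prediction preconditioning (what the base SDXL weights expect)."""
    return 1.0, -sigma, 1.0 / math.sqrt(sigma ** 2 + 1.0), sigma

def edm_scalings(sigma: float, sigma_data: float = 0.5):
    """EDM preconditioning (what the config above requested)."""
    s2 = sigma ** 2 + sigma_data ** 2
    return (sigma_data ** 2 / s2,
            sigma * sigma_data / math.sqrt(s2),
            1.0 / math.sqrt(s2),
            0.25 * math.log(sigma))

# At sigma = 1.0 the two schemes disagree badly (e.g. c_skip 1.0 vs 0.2,
# c_out -1.0 vs ~0.447), so a checkpoint trained under one scheme produces
# distorted images when run under the other.
```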
Thanks, this issue can be closed.