apple / ml-mdm

Train high-quality text-to-image diffusion models in a data & compute efficient manner
https://machinelearning.apple.com/research/matryoshka-diffusion-models
MIT License

Fix some small things #18

Closed: tolgacangoz closed this 5 days ago

tolgacangoz commented 1 month ago

Proposes to fix #10

  1. This pull request updates the cc12m_256x256.yaml and cc12m_1024x1024.yaml files by adding a diffusion_config section because samplers are created with diffusion_config.sampler_config: https://github.com/apple/ml-mdm/blob/36959179b5c103cfe014c2ddaef91c6a24feefbe/ml_mdm/diffusion.py#L94 https://github.com/apple/ml-mdm/blob/36959179b5c103cfe014c2ddaef91c6a24feefbe/ml_mdm/diffusion.py#L287

  2. Also, diffusion_config.no_use_residual is needed: https://github.com/apple/ml-mdm/blob/36959179b5c103cfe014c2ddaef91c6a24feefbe/ml_mdm/diffusion.py#L262

  3. None in the file is read as the string 'None', not as a null value.
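
To illustrate point 3: YAML spells a null value as `null` or `~`, so a bare `None` in a config file is loaded as the string 'None'. The sketch below shows one way such values could be normalized after loading. It is only an illustration, not what this PR does: `normalize_none_strings` is a hypothetical helper, and the keys `example_option` and `example_list` are made up. Only `diffusion_config`, `sampler_config`, and `no_use_residual` come from the description above, and the value given to `no_use_residual` is a placeholder.

```python
from typing import Any


def normalize_none_strings(value: Any) -> Any:
    """Recursively replace the literal string 'None' with Python's None.

    Hypothetical helper for illustration; it is not part of ml-mdm.
    """
    if isinstance(value, dict):
        return {key: normalize_none_strings(item) for key, item in value.items()}
    if isinstance(value, list):
        return [normalize_none_strings(item) for item in value]
    if isinstance(value, str) and value == "None":
        return None
    return value


# Assumed shape of a config after loading the YAML file. The keys
# example_option and example_list are invented for this sketch, and the
# value of no_use_residual is a placeholder.
loaded = {
    "diffusion_config": {
        "no_use_residual": False,
        "sampler_config": {
            "example_option": "None",
            "example_list": [1, "None"],
        },
    }
}

cleaned = normalize_none_strings(loaded)
```

Writing `null` directly in the YAML file avoids the need for any post-processing; a helper like this is only useful when the files cannot be changed.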

Proposes to fix #6

Fixes old module mimicry.

This pull request proposes a refactoring of the samplers.py file to ensure correct handling of scaling when the config.schedule_shifted flag is set to True. This change improves the behavior of the code, especially for higher resolutions.

Prompt = a blue jay stops on the top of a helmet of Japanese samurai, background with sakura tree
Guidance scale = 7.5
Thresholding = clip
Number of steps = 250

Before 1024x1024: [image: before]
After 1024x1024: [image: after]

@MultiPath @Shuangfei @luke-carlson

MultiPath commented 1 month ago

https://github.com/apple/ml-mdm/pull/21

Thanks @tolgacangoz, I think your PR fixed most of the issues. Please check my proposal above (I think it did not fix the typos you mentioned).

tolgacangoz commented 1 month ago

This PR now fixes some small nits.