openai / guided-diffusion


Issue with Attention Resolution Preprocessing #47

Closed BardiaKh closed 2 years ago

BardiaKh commented 2 years ago

First of all, thanks for open-sourcing your implementation. Wherever you look, the base implementation is OpenAI's guided diffusion, which is great!

I was going over the code for a personal project, and I noticed that the model config is preprocessed using the following code in script_util.py:

    for res in attention_resolutions.split(","):
        attention_ds.append(image_size // int(res))
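For concreteness, here is what that preprocessing produces for a typical config (a minimal sketch; `image_size = 256` and `attention_resolutions = "32,16,8"` are assumed example values, not taken from this issue):

```python
# Assumed example values for illustration only.
image_size = 256
attention_resolutions = "32,16,8"

attention_ds = []
for res in attention_resolutions.split(","):
    # Interprets each entry as a feature-map SIZE and converts it
    # into a downsample rate relative to the input resolution.
    attention_ds.append(image_size // int(res))

print(attention_ds)  # [8, 16, 32]
```

So the entries in the config string are treated as feature-map sizes, and the resulting rates change if `image_size` changes.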

However, in unet.py, attention_resolutions is defined as:

a collection of downsample rates at which attention will take place. May be a set, list, or tuple. For example, if this contains 4, then at 4x downsampling, attention will be used.

This means the implementation is meant to be independent of the image resolution, which makes total sense.
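To illustrate that definition, a simplified sketch (not the repo's actual code) of how a UNet loop can consume downsample rates, with assumed example values for `attention_resolutions` and `channel_mult`:

```python
# Assumed example values: attend at 4x and 8x downsampling.
attention_resolutions = {4, 8}
channel_mult = (1, 2, 4, 8)

ds = 1  # current downsample rate relative to the input
attended_at = []
for level, mult in enumerate(channel_mult):
    if ds in attention_resolutions:
        attended_at.append(ds)  # an attention block would be inserted here
    if level != len(channel_mult) - 1:
        ds *= 2  # a downsampling layer halves the spatial resolution

print(attended_at)  # [4, 8]
```

Nothing here depends on the input image size; only the downsample rate matters.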

The only change needed to fix this discrepancy is to rewrite the snippet above as:

    for res in attention_resolutions.split(","):
        attention_ds.append(int(res))
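With that change, the config string is read directly as downsample rates, so its meaning no longer shifts with `image_size` (a sketch with an assumed example string):

```python
attention_resolutions = "32,16,8"  # assumed example string

def parse_attention_ds(attention_resolutions):
    # Each entry is taken verbatim as a downsample rate,
    # matching the definition documented in unet.py.
    return [int(res) for res in attention_resolutions.split(",")]

print(parse_attention_ds(attention_resolutions))  # [32, 16, 8]
```

Note that existing configs written as feature-map sizes would need to be re-expressed as rates under this parsing.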

I would be more than happy to submit a PR, but I first wanted to bring this to your attention and get your opinion.