Why a separate UNetv2?

liuliu / swift-diffusion

BSD 3-Clause "New" or "Revised" License


Closed ghost closed 1 year ago

ghost commented 1 year ago

I thought the UNet in SD2 was the same as before, and the only difference was the text encoder, which produces 1024 channels.

ghost commented 1 year ago

It looks like you are changing the number of heads.
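
To make the head-count difference concrete, here is a rough sketch. It assumes the values in the published reference configs as I read them (SD1: fixed `num_heads: 8`; SD2: `num_head_channels: 64`, with `model_channels: 320` in both), so treat the numbers as assumptions rather than something taken from swift-diffusion. The function and constant names are made up for illustration.

```python
# Assumed from the reference configs: base width 320, and attention at the
# levels with channel multipliers 1, 2 and 4 (widths 320, 640, 1280).
MODEL_CHANNELS = 320
ATTENTION_LEVEL_MULT = (1, 2, 4)


def heads_per_level(num_heads=None, num_head_channels=None):
    """Return the attention head count at each level that has attention.

    Pass num_heads for a fixed head count (SD1 style), or
    num_head_channels for a fixed per-head width (SD2 style).
    """
    heads = []
    for mult in ATTENTION_LEVEL_MULT:
        width = MODEL_CHANNELS * mult
        if num_head_channels is not None:
            # Fixed per-head width: the head count grows with the layer width.
            heads.append(width // num_head_channels)
        else:
            # Fixed head count: the per-head width grows instead.
            heads.append(num_heads)
    return heads


sd1_heads = heads_per_level(num_heads=8)
sd2_heads = heads_per_level(num_head_channels=64)
print("SD1 heads per level:", sd1_heads)
print("SD2 heads per level:", sd2_heads)
```

So under these assumptions the head count is constant in SD1 but varies per level in SD2, which would be enough to need a separate UNet definition even if the rest of the layout matched.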

liuliu commented 1 year ago

ghost commented 1 year ago

Thanks

ghost commented 1 year ago

I was inspecting sd_v2.1_f16.ckpt (from your website) and I saw that the weights do not match the torch version at https://huggingface.co/stabilityai/stable-diffusion-2-1/blob/main/v2-1_768-ema-pruned.ckpt.

Which torch file did you convert it from?

ghost commented 1 year ago

liuliu commented 1 year ago

Yeah, the 768-v file is labelled sd_v2.1_768_v_f16.ckpt.