rom1504 opened this issue 2 years ago
Here are some relevant sections of the paper for reference while in this thread:
They are also using the BSR degradation used by Rombach et al. (https://github.com/CompVis/latent-diffusion/tree/e66308c7f2e64cb581c6d27ab6fbeb846828253b/ldm/modules/image_degradation and https://github.com/cszn/BSRGAN/blob/main/utils/utils_blindsr.py), which I don't have in the repository yet.
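For reference, the BSR-style degradation is roughly blur → downsample → noise applied to the high-res image to make realistic low-res training inputs. A toy numpy sketch of that idea (not the actual BSRGAN code, which randomizes over many kernel and noise types):

```python
import numpy as np

def gaussian_kernel(size=7, sigma=1.5):
    # Simple isotropic Gaussian blur kernel (a stand-in for BSRGAN's
    # mix of isotropic/anisotropic kernels).
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx ** 2 + yy ** 2) / (2 * sigma ** 2))
    return k / k.sum()

def degrade(img, scale=4, sigma=1.5, noise_level=0.02, rng=None):
    # img: float32 HxWxC in [0, 1]; returns a degraded low-res version.
    rng = np.random.default_rng() if rng is None else rng
    k = gaussian_kernel(sigma=sigma)
    h, w, _ = img.shape
    pad = k.shape[0] // 2
    padded = np.pad(img, ((pad, pad), (pad, pad), (0, 0)), mode="reflect")
    blurred = np.zeros_like(img)
    # direct convolution: slow but dependency-free, fine for a demo
    for i in range(h):
        for j in range(w):
            patch = padded[i:i + k.shape[0], j:j + k.shape[1]]
            blurred[i, j] = (patch * k[..., None]).sum(axis=(0, 1))
    # nearest-neighbour downsample, then additive Gaussian noise
    low = blurred[::scale, ::scale] + rng.normal(0, noise_level,
                                                 blurred[::scale, ::scale].shape)
    return np.clip(low, 0, 1).astype(np.float32)
```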
tempted to just go with Imagen's noising procedure (on top of the blur) and call it a day (it would be a lot simpler)
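Imagen's alternative is conditioning augmentation: corrupt the low-res conditioning image with Gaussian noise at a randomly sampled level, and feed that level to the unet as well. A minimal sketch of my reading of that idea (the range and the cosine mapping are assumptions, not the repo's actual implementation):

```python
import math
import numpy as np

def aug_noise(lowres_cond, aug_level=None, rng=None):
    # Corrupt the low-res conditioning image with Gaussian noise at a
    # sampled augmentation level, and return that level so the unet can
    # be conditioned on it (Imagen-style noise conditioning augmentation).
    rng = np.random.default_rng() if rng is None else rng
    if aug_level is None:
        aug_level = rng.uniform(0.0, 0.5)  # hypothetical training range
    # cosine-style mapping from level to signal fraction
    alpha = math.cos(aug_level * math.pi / 2) ** 2
    eps = rng.normal(size=lowres_cond.shape)
    noised = math.sqrt(alpha) * lowres_cond + math.sqrt(1 - alpha) * eps
    return noised.astype(np.float32), aug_level
```

At `aug_level=0` this is the identity, so sampling at level 0 recovers plain low-res conditioning.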
ok, 0.11.0 should allow for the different noise schedules across different unets, as in the paper
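"Different noise schedules per unet" just means each unet in the cascade gets its own diffusion schedule (e.g. cosine for the base, linear for the upsampler). A small illustrative sketch of building one schedule per unet (the pairing below is illustrative, not what the repo ships):

```python
import math

def linear_betas(T, start=1e-4, end=2e-2):
    # Linear beta schedule from DDPM.
    return [start + (end - start) * t / (T - 1) for t in range(T)]

def cosine_alphas_cumprod(T, s=0.008):
    # Cosine schedule (Nichol & Dhariwal), expressed as cumulative alphas.
    f = lambda t: math.cos((t / T + s) / (1 + s) * math.pi / 2) ** 2
    return [f(t) / f(0) for t in range(T)]

# one schedule per unet in the cascade; which unet gets which is a choice
schedules = {
    0: cosine_alphas_cumprod(1000),  # base unet
    1: linear_betas(1000),           # upsampler unet
}
```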
after adding the BSR image degradation (or some alternative), i think i'm comfortable giving the repository a 1.0
I understand only the image (and clip image EMB) is needed and no text ?
@rom1504 yup, no text conditioning needed, i think it should all be in the image embedding!
Hi all, I am aiming to train the decoder and the upsampler. Because they have too many parameters, I have decided to train them separately. The readme says the upsampler and decoder nets can be trained separately, but from my reading of the code, I still need to load the parameters of both unet 0 and unet 1 and set the unet number to 1 in order to train only unet 1. I don't know if that's right; if so, I couldn't train unet 0 and unet 1 on two separate machines. How can I train the decoder net and the upsamplers separately? Best,
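The key point is that each unet's weights are independent during training, so in principle each machine only needs to instantiate the unet it is training. A toy illustration of that pattern (this is not the repo's actual API, just the shape of the idea):

```python
class ToyUnet:
    # Stand-in for a real unet: one scalar "weight" and a gradient step.
    def __init__(self, name):
        self.name = name
        self.weight = 0.0

    def train_step(self, grad, lr=0.1):
        self.weight -= lr * grad

class ToyDecoder:
    # Holds slots for all unets in the cascade, but only the unet being
    # trained on this machine ever needs real weights in memory.
    def __init__(self, num_unets):
        self.unets = [None] * num_unets

    def init_unet(self, unet_number):
        # unet_number is 1-indexed, matching the discussion above
        self.unets[unet_number - 1] = ToyUnet(f"unet{unet_number - 1}")
        return self.unets[unet_number - 1]

decoder = ToyDecoder(num_unets=2)
unet1 = decoder.init_unet(unet_number=2)  # this machine trains only the upsampler
unet1.train_step(grad=1.0)
```

Machine A would call `init_unet(unet_number=1)` and machine B `init_unet(unet_number=2)`; their checkpoints can be merged afterwards since the two unets share no parameters.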
Hey, so we got decent versions of the prior and the basic decoder now.
I think the current code is already able to train upscalers but we need more doc for it.
Let's have an upscaler.md explaining this.
And then train it!
We can also discuss what the right dataset is, but I figure the laion5B subset we call "laion high resolution" could do the trick (it's 170M images at 1024×1024 or bigger).
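Selecting such a subset amounts to filtering on the image-size metadata before downloading. A tiny sketch, assuming metadata records with `width`/`height` fields (the field names are an assumption, matching typical laion metadata):

```python
def filter_high_res(samples, min_side=1024):
    # Keep only records whose shorter side is at least min_side pixels.
    # samples: iterable of dicts carrying "width"/"height" metadata.
    return [s for s in samples if min(s["width"], s["height"]) >= min_side]

samples = [
    {"url": "a", "width": 2048, "height": 1536},
    {"url": "b", "width": 800, "height": 1200},
]
kept = filter_high_res(samples)
```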