cloneofsimo / lora

Using Low-rank adaptation to quickly fine-tune diffusion models.
https://arxiv.org/abs/2106.09685
Apache License 2.0
6.99k stars 481 forks source link

Face seg mask is not in [0, 1] #255

Open nom opened 1 year ago

nom commented 1 year ago

Hi, I've been looking at the face segmentation masking feature. IIUC it should work such that only the face is being learnt and nothing else. So the loss of non-face pixels should be masked to 0.

However, when running the example use_face_conditioning_example.sh, and printing the mask values after https://github.com/cloneofsimo/lora/blob/bdd51b04c49fa90a88919a19850ec3b4cf3c5ecd/lora_diffusion/cli_lora_pti.py#L342-L358 with torch.unique(mask, return_counts=True), I'm seeing the lowest value is 0.35. (tensor([0.3522, 0.3603, 0.4087, 0.4490, 1.0000], device='cuda:0'), tensor([6204, 1, 13, 13, 169], device='cuda:0'))

I see that the mask values are being adjusted here https://github.com/cloneofsimo/lora/blob/master/lora_diffusion/dataset.py#L288-L295 They're first being normalized to 0.5 mean, 0.5 std, and then multiplied by 0.5 and 1 is added, resulting in a 1.25 mean. Is this intended?