mit-han-lab / fastcomposer

[IJCV] FastComposer: Tuning-Free Multi-Subject Image Generation with Localized Attention
https://fastcomposer.mit.edu
MIT License

The ideal value of cross-attention localization loss. #8

Closed Wuchuq closed 1 year ago

Wuchuq commented 1 year ago

Thanks for your great work! I'm a bit confused about the value of the cross-attention localization loss. According to the description of the loss function, when the cross-attention map is close to the segmentation map, shouldn't the loss be near -1?

Guangxuan-Xiao commented 1 year ago

Hi!

You are right. In our experiments, we observed that the localization loss started from zero and converged to about -0.6.
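
A minimal sketch of the balanced L1 localization loss described in the paper, which explains this range: the loss is the mean attention outside each subject's segmentation mask minus the mean attention inside it, so a uniform attention map gives roughly 0 and a perfectly localized map approaches -1. Tensor names and shapes below are illustrative, not the repository's exact code.

```python
import torch

def localization_loss(attn_maps, seg_masks):
    """Balanced L1 cross-attention localization loss (sketch).

    attn_maps: (N, H, W) cross-attention maps, one per subject token.
    seg_masks: (N, H, W) binary segmentation masks for the subjects.
    Returns a scalar in [-1, 1]: ~0 for uniform attention, approaching -1
    when attention concentrates inside the masks. Assumes each mask has
    both foreground and background pixels.
    """
    losses = []
    for attn, mask in zip(attn_maps, seg_masks):
        mask = mask.bool()
        inside = attn[mask].mean()    # attention on the subject region
        outside = attn[~mask].mean()  # attention off the subject region
        losses.append(outside - inside)
    return torch.stack(losses).mean()
```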

[Screenshot: training curve of the localization loss]

Best, Guangxuan

Wuchuq commented 1 year ago

Got it, thanks!

Wuchuq commented 1 year ago

Hi! Could you also tell me the converged value of the denoising loss in your experiments?

Guangxuan-Xiao commented 1 year ago

The denoising loss converged to around 0.09 in our experiments.
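
As a rough sketch of how the two terms fit together during training (assuming the diffusers UNet interface; `localization_loss` is the sketch above, and `loc_weight` is an illustrative hyperparameter name, not necessarily the repository's default):

```python
import torch.nn.functional as F

def training_loss(unet, noisy_latents, timesteps, text_embeds, noise,
                  attn_maps, seg_masks, loc_weight=1e-3):
    # Standard diffusion denoising objective: predict the added noise.
    noise_pred = unet(noisy_latents, timesteps,
                      encoder_hidden_states=text_embeds).sample
    denoise_loss = F.mse_loss(noise_pred, noise)   # ~0.09 at convergence per this thread
    # Cross-attention localization term (~ -0.6 at convergence per this thread).
    loc_loss = localization_loss(attn_maps, seg_masks)
    return denoise_loss + loc_weight * loc_loss
```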

Wuchuq commented 1 year ago

Thanks!