yeungchenwa / FontDiffuser

[AAAI2024] FontDiffuser: One-Shot Font Generation via Denoising Diffusion with Multi-Scale Content Aggregation and Style Contrastive Learning
https://yeungchenwa.github.io/fontdiffuser-homepage/

does it support few-shot? #49

Open jackwolfey opened 1 month ago

jackwolfey commented 1 month ago

Does the model's inference support few-shot generation? Is there a way for the model to take features from multiple reference characters and then run inference? Nice work, by the way.

yeungchenwa commented 1 month ago

Hi @jackwolfey, thanks for your attention~ You can concatenate the features of multiple reference characters into a feature sequence and then inject it into our Denoiser (UNet) through cross-attention. You only need to modify the style hidden-state dimension.
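For reference, a minimal sketch of that idea in PyTorch (not the exact FontDiffuser API; `style_encoder` and `denoiser` are placeholders for the corresponding FontDiffuser modules, and the shapes are assumptions):

```python
import torch

def build_style_sequence(style_encoder, reference_images):
    """reference_images: tensor of shape (k, C, H, W) holding k reference glyphs."""
    feats = []
    for img in reference_images:
        # each encoded reference is assumed to produce (1, seq_len, dim)
        feats.append(style_encoder(img.unsqueeze(0)))
    # concatenate along the sequence (token) dimension -> (1, k * seq_len, dim)
    return torch.cat(feats, dim=1)

# Hypothetical usage: the concatenated sequence replaces the single-reference
# style hidden states fed to the UNet's cross-attention layers; the layers that
# consume the style condition must accept the longer sequence (the "style
# hidden-state dimension" change mentioned above).
# style_seq = build_style_sequence(style_encoder, refs)       # refs: (k, C, H, W)
# noise_pred = denoiser(noisy_latents, timesteps, style_seq)  # placeholder call
```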

jackwolfey commented 1 month ago

@yeungchenwa Thanks for your reply. I tried to locate the code that needs to be modified (see the screenshot below), but I don't actually know how to change it. I am a beginner in AI programming, so could you please give me some detailed instructions on how to modify it? Thank you very much.

[screenshot: pycharm64_2nkJhTGUjc]