lllyasviel / Paints-UNDO

Understand Human Behavior to Align True Needs
Apache License 2.0

Support concat of clip outputs for longer prompt #48

Open KohakuBlueleaf opened 4 months ago

KohakuBlueleaf commented 4 months ago

The wd14 tagger usually gives us prompts of 77~152 tokens, but the current implementation only supports a length of 77 tokens.

I tried to write an adaptive implementation that can output a longer prompt when needed. I'm not sure whether this implementation is OK, but it works well for me.

Some concerns:

  1. Should I drop/mask all the padding tokens?
  2. Should the max length be a user option?
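
For reference, here is a rough sketch of the general idea, not the exact code in this PR. It assumes the usual CLIP layout of 77-token windows (75 prompt tokens plus start and end tokens) and assumes the ids 49406/49407 for start/end, with the end token reused as padding. `split_into_windows`, `encode_long_prompt`, `encode_window`, and `max_windows` are hypothetical names; `encode_window` stands in for whatever runs the text encoder on one window. The returned masks mark padding positions (concern 1), and `max_windows` is one possible shape for a user-facing length cap (concern 2).

```python
from typing import Callable, List, Optional, Sequence, Tuple

BOS_ID = 49406      # assumed CLIP start-of-text token id
EOS_ID = 49407      # assumed CLIP end-of-text token id, reused as padding here
WINDOW = 77         # tokens per CLIP window, including BOS and EOS
BODY = WINDOW - 2   # prompt tokens that fit in one window


def split_into_windows(
    token_ids: Sequence[int], max_windows: Optional[int] = None
) -> Tuple[List[List[int]], List[List[int]]]:
    """Split raw prompt token ids (no BOS/EOS) into padded 77-token windows.

    Returns (windows, masks); a mask is 1 for real tokens and 0 for padding.
    """
    # An empty prompt still yields one window so the output is never empty.
    bodies = [list(token_ids[i:i + BODY]) for i in range(0, len(token_ids), BODY)] or [[]]
    if max_windows is not None:
        bodies = bodies[:max_windows]  # hypothetical user-facing cap

    windows, masks = [], []
    for body in bodies:
        real = [BOS_ID] + body + [EOS_ID]
        pad = WINDOW - len(real)
        windows.append(real + [EOS_ID] * pad)
        masks.append([1] * len(real) + [0] * pad)
    return windows, masks


def encode_long_prompt(
    token_ids: Sequence[int],
    encode_window: Callable[[List[int]], List[List[float]]],
    max_windows: Optional[int] = None,
) -> List[List[float]]:
    """Encode each window separately and concatenate along the sequence axis."""
    windows, _ = split_into_windows(token_ids, max_windows)
    out: List[List[float]] = []
    for window in windows:
        out.extend(encode_window(window))  # one embedding vector per token
    return out
```

With this layout, a 100-token prompt becomes two windows and a 154-position output, while a short prompt stays at 77, which is the "adaptive" part. Whether the padding positions should be dropped or masked downstream is left open here.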