mulanai / MuLan

MuLan: Adapting Multilingual Diffusion Models for 110+ Languages (adds multilingual support to any diffusion model without extra training)

Support for smaller text encoders #3

Open vladmandic opened 4 months ago

vladmandic commented 4 months ago

Currently MuLan internally uses OpenGVLab/InternVL-14B-224px as the default text encoder. While it's possible to pass a path to any downloadable encoder, which ones did you test?

Note that InternVL-14B-224px is a massive model: 27GB on disk and requiring ~17GB of VRAM to run in an FP16 context, which rules out using this library on any normal consumer GPU.
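As a rough sanity check on those numbers, a back-of-envelope sketch (assuming the "14B" in the model name means ~14e9 parameters) shows why a model this size is out of reach for consumer GPUs: FP16 weights alone account for roughly the reported checkpoint size, before any activations or intermediate buffers.

```python
def fp16_weight_gib(n_params: float) -> float:
    """Back-of-envelope weight memory in GiB at 16-bit precision (2 bytes/param)."""
    return n_params * 2 / 2**30

# ~14e9 parameters -> ~26 GiB for the weights alone in FP16,
# consistent with the ~27GB on-disk checkpoint size noted above.
print(round(fp16_weight_gib(14e9), 1))  # → 26.1
```

Consumer GPUs typically top out at 8-24GB of VRAM, so even the ~17GB needed to execute the encoder portion leaves most cards out.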

Zeqiang-Lai commented 4 months ago

Great suggestion! Making MuLan available to everyone is our ultimate goal, and we will experiment with smaller encoders.

zengjie617789 commented 4 months ago

~~I used Mini-InternVL-Chat-2B-V1-5 as the text_encoder, but it loaded very slowly and required typing "yes" to trust_remote_code. What is the problem?~~ It works now.
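For others hitting the same prompt: InternVL repos ship custom modeling code, so the transformers library stops and waits for an interactive "yes" unless `trust_remote_code=True` is passed up front, which can make the first load appear to hang. A minimal loading sketch (the helper name and keyword choices here are illustrative, not MuLan's actual API):

```python
def load_small_encoder(repo_id: str = "OpenGVLab/Mini-InternVL-Chat-2B-V1-5",
                       trust_remote_code: bool = True):
    """Load a smaller text-encoder checkpoint; downloads weights on first call.

    Passing trust_remote_code=True explicitly accepts the repo's custom model
    class and skips the interactive confirmation prompt.
    """
    import torch                        # lazy imports: heavy dependencies
    from transformers import AutoModel
    return AutoModel.from_pretrained(
        repo_id,
        torch_dtype=torch.float16,      # halve memory vs. FP32
        low_cpu_mem_usage=True,         # avoid materializing weights twice in RAM
        trust_remote_code=trust_remote_code,
    )
```

Whether a 2B encoder actually produces usable embeddings for MuLan is a separate question; this only addresses the slow, prompt-blocked load.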