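Great! Kudos! A very creative and genuinely useful project! The only problem is that the InternVL-14B-224px model is really too large; I'd suggest switching to a smaller one.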

wikeeyang commented 4 months ago

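SD15 and SD21 run fine, but their image quality can no longer keep up. For SDXL, where do I find the adapter referenced here?

adapter_path='../ckpts/adapter/sdxl/sdxl_internvl_unet_transformer_dual_xl_1024_ema_random_drop_9000.pth',

Also, the SDXL and PixArt models are good, but a typical 24GB GPU is not enough to run them. I noticed InternVL has done some work with Stable Cascade; that model produces very good images and follows prompts well. Could it be integrated?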

wikeeyang commented 4 months ago

Also, how is the trained MuLan-Pixart full finetuned model meant to be used? Can it directly accept all kinds of prompts inside the Pixart project?

wikeeyang commented 4 months ago

In the meta.json config file, I see that the Pixart entries all point to the original models.

Zeqiang-Lai commented 4 months ago

The sdxl one is https://huggingface.co/mulanai/mulan-lang-adapter/blob/main/sdxl_aesthetic.pth
You can refer to the updated sdxl test script: https://github.com/mulanai/MuLan/blob/main/examples/sdxl.py
For the pixart full finetuned model, I uploaded a test script: https://github.com/mulanai/MuLan/blob/main/examples/pixart_full.py

wikeeyang commented 4 months ago

Thanks for the quick response! Kudos, and thank you! I'll test some more. OpenGVLab-Mini-InternVL-Chat-2B-V1-5 and Cohere-Aya-23-8B are both quite good, for your reference.

wikeeyang commented 4 months ago

My environment: Win11 x64, Python 3.11.9, torch 2.3.0, CUDA 12.1, and a 24GB P40 GPU. Following the project's latest update, I tested pixart_full.py. The environment does not support flash attention, so I did not install it. As prompted, I installed the beautifulsoup4 and ftfy dependencies. The script runs mostly normally, apart from a cuDNN warning, but the result is not correct (see the attached image). The script output is below; could you help analyze the cause? Thanks!

D:\AITest\MuLan>python pixart_full.py
FlashAttention is not installed.
FlashAttention is not installed.
Loading pipeline components...: 0%| | 0/5 [00:00<?, ?it/s]
Some weights of the model checkpoint were not used when initializing Transformer2DModel: ['caption_projection.y_embedding']
Loading pipeline components...: 20%|██████████ | 1/5 [00:00<00:00, 4.64it/s]
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████| 2/2 [00:00<00:00, 8.64it/s]
Loading pipeline components...: 100%|██████████████████████████████████████████████████| 5/5 [00:00<00:00, 6.83it/s]
0%| | 0/20 [00:00<?, ?it/s]
D:\AITest\MuLan\Python311\Lib\site-packages\torch\nn\modules\conv.py:456: UserWarning: Plan failed with a cudnnException: CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR: cudnnFinalize Descriptor Failed cudnn_status: CUDNN_STATUS_NOT_SUPPORTED (Triggered internally at ..\aten\src\ATen\native\cudnn\Conv_v8.cpp:919.)
  return F.conv2d(input, weight, bias, self.stride,
D:\AITest\MuLan\Python311\Lib\site-packages\diffusers\models\attention_processor.py:1279: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at ..\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:455.)
  hidden_states = F.scaled_dot_product_attention(
100%|████████████████████████████████████████████████████████████████████████████████| 20/20 [01:35<00:00, 4.76s/it]

Output image: pixart_full_me-pixart_1024-03

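Also, if InternVL-14B-224px cannot be loaded in quantized form, its GPU memory requirement is really high. I imagine most projects will not use an 80GB GPU just to run SDXL image generation. Could you consider deploying the language model InternVL-14B-224px separately from the image model? That is, InternVL-14B-224px would run in its own environment and be exposed through an API, while MuLan would run in another environment. This would lower the VRAM required on any single GPU. Of course, I don't yet understand the project's core technology well enough to know whether this is feasible. It's only a rough suggestion, so please bear with me.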
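A minimal sketch of the split proposed above, under stated assumptions: the text encoder lives behind a small HTTP endpoint, and the image-generation process only fetches prompt embeddings from it. Everything here is hypothetical and for illustration only: `encode_prompts` is a stand-in that returns fake vectors rather than calling InternVL, and the `/encode` route, the JSON layout, and `EMBED_DIM` are not part of MuLan.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

EMBED_DIM = 8  # hypothetical width; a real encoder's output would differ


def encode_prompts(prompts):
    """Stand-in for the real text encoder (hypothetical).

    Returns one deterministic pseudo-embedding per prompt so the sketch
    runs without a GPU or model weights.
    """
    return [
        [((sum(p.encode("utf-8")) + i) % 97) / 97.0 for i in range(EMBED_DIM)]
        for p in prompts
    ]


class EncodeHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON request body: {"prompts": [...]}
        length = int(self.headers.get("Content-Length", 0))
        prompts = json.loads(self.rfile.read(length))["prompts"]
        body = json.dumps({"embeddings": encode_prompts(prompts)}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def start_encoder_server(port=0):
    """Start the encoder service in a background thread (port 0 = any free port)."""
    server = HTTPServer(("127.0.0.1", port), EncodeHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch_embeddings(url, prompts):
    """Client side: what the image-generation process would call."""
    data = json.dumps({"prompts": prompts}).encode("utf-8")
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    # Bypass any system proxy, since this demo only talks to localhost.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(req) as resp:
        return json.loads(resp.read())["embeddings"]


if __name__ == "__main__":
    server = start_encoder_server()
    url = f"http://127.0.0.1:{server.server_address[1]}/encode"
    embeddings = fetch_embeddings(url, ["一辆红色汽车", "كلب على شاطئ البحر"])
    print(len(embeddings), "embeddings of width", len(embeddings[0]))
    server.shutdown()
    server.server_close()
```

In a real deployment the two halves would run on different machines or GPUs, and the embeddings would more likely travel as binary tensors than as JSON lists; the point of the sketch is only that the diffusion side never has to hold the encoder weights.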

wikeeyang commented 4 months ago

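Oh, one more thing: because I have so little VRAM, I only used two prompts when testing the script above: '一辆红色汽车' (a red car) and 'كلب على شاطئ البحر' (a dog on the beach).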

Zeqiang-Lai commented 4 months ago

I forgot to load the text encoder; it should work now: https://github.com/mulanai/MuLan/blob/main/examples/pixart_full.py

Zeqiang-Lai commented 4 months ago

Deploying the text encoder separately is feasible, but that is an engineering matter. We'll find time to optimize it.

wikeeyang commented 4 months ago

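Thanks for the quick reply! As soon as the text encoder model is loaded, my environment can no longer run it, but the script itself should be OK now. Thanks! I hope the project succeeds and gets widely adopted.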

wikeeyang commented 4 months ago

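With a 24GB GPU, the only ones I can run properly are SD15 and SD21. The image below is the result of SD21 generating two images; with any more prompts it runs out of VRAM. (image: sd21_me_1_2)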
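A rough back-of-the-envelope estimate shows why a 24GB card struggles here. It assumes, purely for illustration, that the "14B" in the model name is the parameter count actually loaded (the portion MuLan uses may be smaller), and it counts weights only, ignoring activations, the diffusion model, and the VAE.

```python
GIB = 1024 ** 3


def weight_memory_gib(num_params: float, bytes_per_param: float) -> float:
    """Memory needed just to hold the weights, in GiB."""
    return num_params * bytes_per_param / GIB


# Assumption: 14e9 parameters, taken at face value from the model name.
NUM_PARAMS = 14e9

# Bytes per parameter for a few common storage formats.
for name, nbytes in [("fp16/bf16", 2), ("int8", 1), ("int4", 0.5)]:
    print(f"{name:>9}: {weight_memory_gib(NUM_PARAMS, nbytes):5.1f} GiB")
```

Under these assumptions the half-precision weights alone come to about 26 GiB, which already exceeds 24GB before the image model is loaded, while 8-bit or 4-bit storage would leave headroom.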

ApolloRay commented 4 months ago

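> The sdxl one is https://huggingface.co/mulanai/mulan-lang-adapter/blob/main/sdxl_aesthetic.pth You can refer to the updated sdxl test script: https://github.com/mulanai/MuLan/blob/main/examples/sdxl.py For the pixart full finetuned model, I uploaded a test script: https://github.com/mulanai/MuLan/blob/main/examples/pixart_full.py

When will you get a chance to respond on your OpenDMD repo? My idea is to put MuLan in front and DMD behind it, which would give near-instant image generation from Chinese prompts. But I've run into some problems training OpenDMD, so I hope you can reply to the issue there.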

Zeqiang-Lai commented 4 months ago

> When will you get a chance to respond on your OpenDMD repo? My idea is to put MuLan in front and DMD behind it, which would give near-instant image generation from Chinese prompts. But I've run into some problems training OpenDMD, so I hope you can reply to the issue there.

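Hi, no extra training is needed. MuLan can be attached directly to DMD's open-source pretrained models to get near-instant image generation from Chinese prompts.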

mulanai / MuLan, issue #4