alibaba / MNN

MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba
http://www.mnn.zone/

run diffusion demo error #2968

Open LHB116 opened 1 month ago

LHB116 commented 1 month ago
  1. MNN version: 2.9.3

  2. Build and run steps follow https://mnn-docs.readthedocs.io/en/latest/transformers/diffusion.html

  3. Build MNN and the MNN conversion tool:
     cmake .. -DMNN_BUILD_DIFFUSION=ON -DMNN_BUILD_OPENCV=ON -DMNN_IMGCODECS=ON -DMNN_OPENCL=ON -DMNN_SEP_BUILD=OFF -DMNN_SUPPORT_TRANSFORMER_FUSE=ON -DMNN_BUILD_CONVERTER=ON -DMNN_BUILD_TORCH=ON

  4. Convert the diffusion model from torch to ONNX:
     cd mnn_path/transformers/diffusion/
     python export/onnx_export.py \
         --model_path "IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Chinese-v0.1" \
         --output_path onnx_save_path

  5. Convert ONNX to MNN:
     ./MNNConvert -f ONNX --modelFile onnx_save_path/text_encoder/model.onnx --MNNModel mnn_save_path/text_encoder.mnn --weightQuantBits 8 --bizCode biz
     ./MNNConvert -f ONNX --modelFile onnx_save_path/unet/model.onnx --MNNModel mnn_save_path/unet.mnn --transformerFuse --weightQuantBits 8 --bizCode biz
     ./MNNConvert -f ONNX --modelFile onnx_save_path/vae_decoder/model.onnx --keepInputFormat --MNNModel mnn_save_path/vae_decoder.mnn --weightQuantBits 8 --bizCode biz

  6. Run the demo:
     ./diffusion_demo mnn_taiyi/ 1 demo.jpg "一只可爱的猫"
     model resource path: mnn_taiyi/
     model type is stable diffusion taiyi chinese version
     output img_name: demo.jpg
     input texts: 一只可爱的猫
     CPU Group: [ 11 6 2 14 10 7 3 15 ], 1200000 - 4500000
     CPU Group: [ 8 4 0 12 ], 1200000 - 4600000
     CPU Group: [ 9 5 1 13 ], 1200000 - 4700000
     The device supports: i8sdot:0, fp16:0, i8mm: 0, sve2: 0
     Can't open file:.tempcache
     Load Cache file error.
     Model loading and initilizing...
     First time initilizing may cost a few seconds to create cachefile, please wait ...
     [##-] [ 66%]Map error biasPtrCL == nullptr
     Map error filterPtrCL == nullptr
     Map error biasPtrCL == nullptr
     Map error ptrCL == nullptr
     clBuffer map error!
     Segmentation fault

How can I resolve this? There were no errors before step 6.

bitxsw93 commented 1 month ago

What machine are you using, and how much memory does it have? It may have run out of memory.

LHB116 commented 1 month ago

> What machine are you using, and how much memory does it have? It may have run out of memory.

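I ran it on a server, so in theory there should be enough memory. I previously ran the mnn-stable-diffusion repository without any problems. I saw that MNN 2.8.0 optimized the Transformer operators and wanted to see the effect.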

Basicname commented 1 month ago

I ran into the same problem with OpenCL on an RK3588.

Basicname commented 1 month ago

@#2901 Do we need to wait until MNN 3.0 is released?

bitxsw93 commented 1 month ago

@LHB116 OpenCL on NVIDIA servers may not support fp16. Which backend did you use when you ran the mnn-stable-diffusion repository?
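One way to check the fp16 point, as a minimal sketch: OpenCL devices that support half precision advertise the `cl_khr_fp16` extension, which the `clinfo` utility lists among the device extensions. The helper name `opencl_has_fp16` is hypothetical, and the sketch assumes `clinfo` may or may not be installed. On a machine with several OpenCL devices it only tells you that at least one of them reports the extension.

```shell
#!/bin/sh
# Hypothetical helper: reads clinfo-style text on stdin and succeeds
# if the cl_khr_fp16 extension name appears anywhere in it.
opencl_has_fp16() {
    grep -q 'cl_khr_fp16'
}

# Query the real device only when clinfo is available (assumption: it may not be installed).
if command -v clinfo >/dev/null 2>&1; then
    if clinfo 2>/dev/null | opencl_has_fp16; then
        echo "OpenCL device reports cl_khr_fp16"
    else
        echo "cl_khr_fp16 not reported; fp16 may be unavailable on this OpenCL device"
    fi
fi
```

This does not by itself explain the `Map error ... == nullptr` lines in the log; it only confirms or rules out the fp16 hypothesis.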

LHB116 commented 1 month ago

> @LHB116 OpenCL on NVIDIA servers may not support fp16. Which backend did you use when you ran the mnn-stable-diffusion repository?

It was probably running on the CPU (the log showed "Can't Find type=2 backend, use 0 instead"). I changed the backend to CPU (MNN_FORWARD_CPU), and now I get a Segmentation fault:

model resource path: mnn_taiyi
model type is stable diffusion taiyi chinese version
output img_name: demo.jpg
input texts: 一只可爱的猫
CPU Group: [ 11 6 2 14 10 7 3 15 ], 1200000 - 4500000
CPU Group: [ 8 4 0 12 ], 1200000 - 4600000
CPU Group: [ 9 5 1 13 ], 1200000 - 4700000
The device supports: i8sdot:0, fp16:0, i8mm: 0, sve2: 0
Model loading and initilizing...
First time initilizing may cost a few seconds to create cachefile, please wait ...
[#--] [ 33%]Segmentation fault

bitxsw93 commented 1 month ago

> It was probably running on the CPU (the log showed "Can't Find type=2 backend, use 0 instead"). I changed the backend to CPU (MNN_FORWARD_CPU), and now I get a Segmentation fault.

On the CPU backend, some transformer operators are not implemented yet. You need to turn off the corresponding macro when building MNNConverter: -DMNN_SUPPORT_TRANSFORMER_FUSE=OFF
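As a rough sketch of that suggestion, the snippet below prints (without running) the configure command from step 3 of the report with only `MNN_SUPPORT_TRANSFORMER_FUSE` switched to `OFF`. The function name `print_cpu_cmake_cmd` is hypothetical and every other flag is copied from the original post. Presumably the models then need to be re-converted with the rebuilt converter (step 5). The thread does not say so, nor whether `--transformerFuse` should be dropped from the unet command, so treat both as assumptions to verify.

```shell
#!/bin/sh
# Hypothetical helper: prints the cmake configure line from step 3,
# changing only MNN_SUPPORT_TRANSFORMER_FUSE from ON to OFF.
print_cpu_cmake_cmd() {
    printf '%s' 'cmake ..'
    for flag in \
        -DMNN_BUILD_DIFFUSION=ON \
        -DMNN_BUILD_OPENCV=ON \
        -DMNN_IMGCODECS=ON \
        -DMNN_OPENCL=ON \
        -DMNN_SEP_BUILD=OFF \
        -DMNN_SUPPORT_TRANSFORMER_FUSE=OFF \
        -DMNN_BUILD_CONVERTER=ON \
        -DMNN_BUILD_TORCH=ON
    do
        printf ' %s' "$flag"
    done
    printf '\n'
}

# Dry run: show the command so it can be reviewed before running it in the build directory.
print_cpu_cmake_cmd
```

Printing instead of executing keeps the sketch safe to paste; run the printed line from the build directory once it looks right.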