Tencent / ncnn

ncnn is a high-performance neural network inference framework optimized for the mobile platform
Other
20.34k stars 4.16k forks source link

在开启了vulkan时,推理阶段显示cannot create up with use_fp16_storage if not_support_fp16 storage错误 #2844

Open Eleanor456 opened 3 years ago

Eleanor456 commented 3 years ago

在我不启用vulkan时,模型可以正常使用,但是启用了vulkan加速之后

net.opt.use_vulkan_compute = true; net.set_vulkan_device(0);

,则会在extract阶段

extractor.extract("out1", out);

出现如下错误:

[0 Adreno (TM)540] queueC=[3] queueG=[3] queueT=[3] [0 Adreno (TM) 540] bugsbn1=1 bugbilz=1 bugcopc=1 bugihfa=0 [0 Adreno (TM) 540] fp16-p/s/a=1/0 /0 int8-p/s/a=1/0/0 [0 Adreno (TM) 540] subgroup=128 basic=0 vote= ballot= shuffle=0 segmentation fault (core dumped)

请问一下,要怎么解决呢,谢谢~

nihui commented 3 years ago

换别的模型也有问题吗? 以及,请上传下模型的 param 文件

Eleanor456 commented 3 years ago

换别的模型也有问题吗? 以及,请上传下模型的 param 文件

对,对于ocr模型中的dbnet,anglenet和crnnnet都有这个问题,并且在linux编译运行也存在这个vulkan开启后segment的错误。其中dbnet的param如下(下载的开源模型):

7767517 142 163 Input input0 0 1 input0 Convolution 346 1 1 input0 346 0=16 1=3 3=2 4=1 5=1 6=432 HardSwish 353 1 1 346 353 0=1.666667e-01 Split splitncnn_0 1 2 353 353_splitncnn_0 353_splitncnn_1 ConvolutionDepthWise 354 1 1 353_splitncnn_1 356 0=16 1=3 4=1 5=1 6=144 7=16 9=1 Convolution 357 1 1 356 357 0=16 1=1 5=1 6=256 BinaryOp 359 2 1 353_splitncnn_0 357 359 Convolution 360 1 1 359 362 0=64 1=1 5=1 6=1024 9=1 ConvolutionDepthWise 363 1 1 362 365 0=64 1=3 3=2 4=1 5=1 6=576 7=64 9=1 Convolution 366 1 1 365 366 0=24 1=1 5=1 6=1536 Split splitncnn_1 1 2 366 366_splitncnn_0 366_splitncnn_1 Convolution 368 1 1 366_splitncnn_1 370 0=72 1=1 5=1 6=1728 9=1 ConvolutionDepthWise 371 1 1 370 373 0=72 1=3 4=1 5=1 6=648 7=72 9=1 Convolution 374 1 1 373 374 0=24 1=1 5=1 6=1728 BinaryOp 376 2 1 366_splitncnn_0 374 376 Split splitncnn_2 1 2 376 376_splitncnn_0 376_splitncnn_1 Convolution 377 1 1 376_splitncnn_1 379 0=72 1=1 5=1 6=1728 9=1 ConvolutionDepthWise 380 1 1 379 380 0=72 1=5 3=2 4=2 5=1 6=1800 7=72 Split splitncnn_3 1 2 380 380_splitncnn_0 380_splitncnn_1 Pooling 388 1 1 380_splitncnn_1 392 0=1 4=1 InnerProduct 393 1 1 392 394 0=24 1=1 2=1728 9=1 InnerProduct 395 1 1 394 395 0=72 1=1 2=1728 HardSigmoid 400 1 1 395 400 0=1.666667e-01 BinaryOp 409 2 1 380_splitncnn_0 400 409 0=2 ReLU 410 1 1 409 410 Convolution 411 1 1 410 411 0=32 1=1 5=1 6=2304 Split splitncnn_4 1 2 411 411_splitncnn_0 411_splitncnn_1 Convolution 413 1 1 411_splitncnn_1 415 0=96 1=1 5=1 6=3072 9=1 ConvolutionDepthWise 416 1 1 415 416 0=96 1=5 4=2 5=1 6=2400 7=96 Split splitncnn_5 1 2 416 416_splitncnn_0 416_splitncnn_1 Pooling 424 1 1 416_splitncnn_1 428 0=1 4=1 InnerProduct 429 1 1 428 430 0=24 1=1 2=2304 9=1 InnerProduct 431 1 1 430 431 0=96 1=1 2=2304 HardSigmoid 436 1 1 431 436 0=1.666667e-01 BinaryOp 445 2 1 416_splitncnn_0 436 445 0=2 ReLU 446 1 1 445 446 Convolution 447 1 1 446 447 0=32 1=1 5=1 6=3072 BinaryOp 449 2 1 411_splitncnn_0 447 449 Split splitncnn_6 1 2 449 449_splitncnn_0 449_splitncnn_1 Convolution 450 1 1 449_splitncnn_1 452 0=96 1=1 5=1 6=3072 9=1 ConvolutionDepthWise 453 1 1 452 453 0=96 1=5 4=2 5=1 6=2400 7=96 Split splitncnn_7 1 2 453 453_splitncnn_0 453_splitncnn_1 Pooling 461 1 1 453_splitncnn_1 465 0=1 4=1 InnerProduct 466 1 1 465 467 0=24 1=1 2=2304 9=1 InnerProduct 468 1 1 467 468 0=96 1=1 2=2304 HardSigmoid 473 1 1 468 473 0=1.666667e-01 BinaryOp 482 2 1 453_splitncnn_0 473 482 0=2 ReLU 483 1 1 482 483 Convolution 484 1 1 483 484 0=32 1=1 5=1 6=3072 BinaryOp 486 2 1 449_splitncnn_0 484 486 Split splitncnn_8 1 2 486 486_splitncnn_0 486_splitncnn_1 Convolution 487 1 1 486_splitncnn_1 487 0=192 1=1 5=1 6=6144 HardSwish 494 1 1 487 494 0=1.666667e-01 ConvolutionDepthWise 495 1 1 494 495 0=192 1=3 3=2 4=1 5=1 6=1728 7=192 HardSwish 502 1 1 495 502 0=1.666667e-01 Convolution 503 1 1 502 503 0=64 1=1 5=1 6=12288 Split splitncnn_9 1 2 503 503_splitncnn_0 503_splitncnn_1 Convolution 505 1 1 503_splitncnn_1 505 0=160 1=1 5=1 6=10240 HardSwish 512 1 1 505 512 0=1.666667e-01 ConvolutionDepthWise 513 1 1 512 513 0=160 1=3 4=1 5=1 6=1440 7=160 HardSwish 520 1 1 513 520 0=1.666667e-01 Convolution 521 1 1 520 521 0=64 1=1 5=1 6=10240 BinaryOp 523 2 1 503_splitncnn_0 521 523 Split splitncnn_10 1 2 523 523_splitncnn_0 523_splitncnn_1 Convolution 524 1 1 523_splitncnn_1 524 0=144 1=1 5=1 6=9216 HardSwish 531 1 1 524 531 0=1.666667e-01 ConvolutionDepthWise 532 1 1 531 532 0=144 1=3 4=1 5=1 6=1296 7=144 HardSwish 539 1 1 532 539 0=1.666667e-01 Convolution 540 1 1 539 540 0=64 1=1 5=1 6=9216 BinaryOp 542 2 1 523_splitncnn_0 540 542 Split splitncnn_11 1 2 542 542_splitncnn_0 542_splitncnn_1 Convolution 543 1 1 542_splitncnn_1 543 0=144 1=1 5=1 6=9216 HardSwish 550 1 1 543 550 0=1.666667e-01 ConvolutionDepthWise 551 1 1 550 551 0=144 1=3 4=1 5=1 6=1296 7=144 HardSwish 558 1 1 551 558 0=1.666667e-01 Convolution 559 1 1 558 559 0=64 1=1 5=1 6=9216 BinaryOp 561 2 1 542_splitncnn_0 559 561 Convolution 562 1 1 561 562 0=384 1=1 5=1 6=24576 HardSwish 569 1 1 562 569 0=1.666667e-01 ConvolutionDepthWise 570 1 1 569 570 0=384 1=3 4=1 5=1 6=3456 7=384 Split splitncnn_12 1 2 570 570_splitncnn_0 570_splitncnn_1 Pooling 578 1 1 570_splitncnn_1 582 0=1 4=1 InnerProduct 583 1 1 582 584 0=96 1=1 2=36864 9=1 InnerProduct 585 1 1 584 585 0=384 1=1 2=36864 HardSigmoid 590 1 1 585 590 0=1.666667e-01 BinaryOp 599 2 1 570_splitncnn_0 590 599 0=2 HardSwish 605 1 1 599 605 0=1.666667e-01 Convolution 606 1 1 605 606 0=88 1=1 5=1 6=33792 Split splitncnn_13 1 2 606 606_splitncnn_0 606_splitncnn_1 Convolution 608 1 1 606_splitncnn_1 608 0=528 1=1 5=1 6=46464 HardSwish 615 1 1 608 615 0=1.666667e-01 ConvolutionDepthWise 616 1 1 615 616 0=528 1=3 4=1 5=1 6=4752 7=528 Split splitncnn_14 1 2 616 616_splitncnn_0 616_splitncnn_1 Pooling 624 1 1 616_splitncnn_1 628 0=1 4=1 InnerProduct 629 1 1 628 630 0=136 1=1 2=71808 9=1 InnerProduct 631 1 1 630 631 0=528 1=1 2=71808 HardSigmoid 636 1 1 631 636 0=1.666667e-01 BinaryOp 645 2 1 616_splitncnn_0 636 645 0=2 HardSwish 651 1 1 645 651 0=1.666667e-01 Convolution 652 1 1 651 652 0=88 1=1 5=1 6=46464 BinaryOp 654 2 1 606_splitncnn_0 652 654 Split splitncnn_15 1 2 654 654_splitncnn_0 654_splitncnn_1 Convolution 655 1 1 654_splitncnn_1 655 0=528 1=1 5=1 6=46464 HardSwish 662 1 1 655 662 0=1.666667e-01 ConvolutionDepthWise 663 1 1 662 663 0=528 1=5 3=2 4=2 5=1 6=13200 7=528 Split splitncnn_16 1 2 663 663_splitncnn_0 663_splitncnn_1 Pooling 671 1 1 663_splitncnn_1 675 0=1 4=1 InnerProduct 676 1 1 675 677 0=136 1=1 2=71808 9=1 InnerProduct 678 1 1 677 678 0=528 1=1 2=71808 HardSigmoid 683 1 1 678 683 0=1.666667e-01 BinaryOp 692 2 1 663_splitncnn_0 683 692 0=2 HardSwish 698 1 1 692 698 0=1.666667e-01 Convolution 699 1 1 698 699 0=120 1=1 5=1 6=63360 Convolution 701 1 1 699 703 0=24 1=1 5=1 6=2880 9=1 Split splitncnn_17 1 2 703 703_splitncnn_0 703_splitncnn_1 Convolution 704 1 1 654_splitncnn_0 706 0=24 1=1 5=1 6=2112 9=1 Interp 723 1 1 703_splitncnn_1 723 0=1 1=2.000000e+00 2=2.000000e+00 BinaryOp 724 2 1 723 706 724 Convolution 725 1 1 724 727 0=24 1=3 4=1 5=1 6=5184 9=1 Split splitncnn_18 1 2 727 727_splitncnn_0 727_splitncnn_1 Convolution 728 1 1 486_splitncnn_0 730 0=24 1=1 5=1 6=768 9=1 Interp 747 1 1 727_splitncnn_1 747 0=1 1=2.000000e+00 2=2.000000e+00 BinaryOp 748 2 1 747 730 748 Convolution 749 1 1 748 751 0=24 1=3 4=1 5=1 6=5184 9=1 Split splitncnn_19 1 2 751 751_splitncnn_0 751_splitncnn_1 Convolution 752 1 1 376_splitncnn_0 754 0=24 1=1 5=1 6=576 9=1 Interp 771 1 1 751_splitncnn_1 771 0=1 1=2.000000e+00 2=2.000000e+00 BinaryOp 772 2 1 771 754 772 Convolution 773 1 1 772 775 0=24 1=3 4=1 5=1 6=5184 9=1 Interp 792 1 1 751_splitncnn_0 792 0=1 1=2.000000e+00 2=2.000000e+00 Interp 803 1 1 727_splitncnn_0 803 0=1 1=4.000000e+00 2=4.000000e+00 Interp 814 1 1 703_splitncnn_0 814 0=1 1=8.000000e+00 2=8.000000e+00 Concat 815 4 1 775 792 803 814 815 Convolution 816 1 1 815 818 0=96 1=3 4=1 5=1 6=82944 9=1 Split splitncnn_20 1 2 818 818_splitncnn_0 818_splitncnn_1 Convolution 819 1 1 818_splitncnn_1 821 0=24 1=3 4=1 5=1 6=20736 9=1 Deconvolution 822 1 1 821 824 0=24 1=2 3=2 5=1 6=2304 9=1 Deconvolution 825 1 1 824 826 0=1 1=2 3=2 5=1 6=96 9=4 Convolution 827 1 1 818_splitncnn_0 829 0=24 1=3 4=1 5=1 6=20736 9=1 Deconvolution 830 1 1 829 832 0=24 1=2 3=2 5=1 6=2304 9=1 Deconvolution 833 1 1 832 834 0=1 1=2 3=2 5=1 6=96 9=4 Concat out1 2 1 826 834 out1