Tencent / ncnn

ncnn is a high-performance neural network inference framework optimized for the mobile platform

"layer Resize not exists or registered": should I register an op? How can I add a new op? #5626

Closed Mauerrr closed 3 months ago

Mauerrr commented 3 months ago

Log

Loading model from: /home/eze2szh/third_party/ncnn/mypro/best.ncnn.param
/home/eze2szh/third_party/ncnn/mypro/best.ncnn.bin
[0 NVIDIA RTX A5000]  queueC=2[8]  queueG=0[16]  queueT=1[2]
[0 NVIDIA RTX A5000]  bugsbn1=0  bugbilz=0  bugcopc=0  bugihfa=0
[0 NVIDIA RTX A5000]  fp16-p/s/u/a=1/1/1/1  int8-p/s/u/a=1/1/1/1
[0 NVIDIA RTX A5000]  subgroup=32  basic/vote/ballot/shuffle=1/1/1/1
[0 NVIDIA RTX A5000]  fp16-8x8x16/16x8x8/16x8x16/16x16x16=0/1/1/1
[1 llvmpipe (LLVM 12.0.0, 256 bits)]  queueC=0[1]  queueG=0[1]  queueT=0[1]
[1 llvmpipe (LLVM 12.0.0, 256 bits)]  bugsbn1=0  bugbilz=0  bugcopc=0  bugihfa=0
[1 llvmpipe (LLVM 12.0.0, 256 bits)]  fp16-p/s/u/a=1/1/1/0  int8-p/s/u/a=1/1/1/0
[1 llvmpipe (LLVM 12.0.0, 256 bits)]  subgroup=8  basic/vote/ballot/shuffle=1/1/1/0
[1 llvmpipe (LLVM 12.0.0, 256 bits)]  fp16-8x8x16/16x8x8/16x8x16/16x16x16=0/0/0/0
[2 NVIDIA RTX A5000]  queueC=2[8]  queueG=0[16]  queueT=1[2]
[2 NVIDIA RTX A5000]  bugsbn1=0  bugbilz=0  bugcopc=0  bugihfa=0
[2 NVIDIA RTX A5000]  fp16-p/s/u/a=1/1/1/1  int8-p/s/u/a=1/1/1/1
[2 NVIDIA RTX A5000]  subgroup=32  basic/vote/ballot/shuffle=1/1/1/1
[2 NVIDIA RTX A5000]  fp16-8x8x16/16x8x8/16x8x16/16x16x16=0/1/1/1
layer Resize not exists or registered

Problem

I'm getting the error "layer Resize not exists or registered". Resize is a layer in my model, but ncnn reports that it does not exist or is not registered. How do I register this operator, or how else should I solve this problem? I am deploying a YOLOv8 pose model. The input is 1x3x320x320 and the output is 1x36x2100. The 36 values per anchor are 4 bounding-box values, 5 class scores, and 9 keypoints with 3 values each (x, y, and visibility). This is the .param file.

7767517
236 280
Input                    in0                      0 1 in0
Convolution              conv_69                  1 1 in0 1 0=16 1=3 11=3 12=1 13=2 14=1 2=1 3=2 4=1 5=1 6=432
Swish                    silu_5                   1 1 1 2
Convolution              conv_70                  1 1 2 3 0=32 1=3 11=3 12=1 13=2 14=1 2=1 3=2 4=1 5=1 6=4608
Swish                    silu_6                   1 1 3 4
Convolution              conv_71                  1 1 4 5 0=32 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=1 6=1024
Swish                    silu_7                   1 1 5 6
Slice                    split_0                  1 2 6 7 8 -23300=2,16,16 1=0
Split                    splitncnn_0              1 3 8 9 10 11
Convolution              conv_72                  1 1 11 12 0=16 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=2304
Swish                    silu_8                   1 1 12 13
Convolution              conv_73                  1 1 13 14 0=16 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=2304
Swish                    silu_9                   1 1 14 15
BinaryOp                 add_0                    2 1 10 15 16 0=0
Concat                   cat_0                    3 1 7 9 16 17 0=0
Convolution              conv_74                  1 1 17 18 0=32 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=1 6=1536
Swish                    silu_10                  1 1 18 19
Convolution              conv_75                  1 1 19 20 0=64 1=3 11=3 12=1 13=2 14=1 2=1 3=2 4=1 5=1 6=18432
Swish                    silu_11                  1 1 20 21
Convolution              conv_76                  1 1 21 22 0=64 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=1 6=4096
Swish                    silu_12                  1 1 22 23
Slice                    split_1                  1 2 23 24 25 -23300=2,32,32 1=0
Split                    splitncnn_1              1 3 25 26 27 28
Convolution              conv_77                  1 1 28 29 0=32 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=9216
Swish                    silu_13                  1 1 29 30
Convolution              conv_78                  1 1 30 31 0=32 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=9216
Swish                    silu_14                  1 1 31 32
BinaryOp                 add_1                    2 1 27 32 33 0=0
Split                    splitncnn_2              1 3 33 34 35 36
Convolution              conv_79                  1 1 36 37 0=32 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=9216
Swish                    silu_15                  1 1 37 38
Convolution              conv_80                  1 1 38 39 0=32 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=9216
Swish                    silu_16                  1 1 39 40
BinaryOp                 add_2                    2 1 35 40 41 0=0
Concat                   cat_1                    4 1 24 26 34 41 42 0=0
Convolution              conv_81                  1 1 42 43 0=64 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=1 6=8192
Swish                    silu_17                  1 1 43 44
Split                    splitncnn_3              1 2 44 45 46
Convolution              conv_82                  1 1 46 47 0=128 1=3 11=3 12=1 13=2 14=1 2=1 3=2 4=1 5=1 6=73728
Swish                    silu_18                  1 1 47 48
Convolution              conv_83                  1 1 48 49 0=128 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=1 6=16384
Swish                    silu_19                  1 1 49 50
Slice                    split_2                  1 2 50 51 52 -23300=2,64,64 1=0
Split                    splitncnn_4              1 3 52 53 54 55
Convolution              conv_84                  1 1 55 56 0=64 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=36864
Swish                    silu_20                  1 1 56 57
Convolution              conv_85                  1 1 57 58 0=64 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=36864
Swish                    silu_21                  1 1 58 59
BinaryOp                 add_3                    2 1 54 59 60 0=0
Split                    splitncnn_5              1 3 60 61 62 63
Convolution              conv_86                  1 1 63 64 0=64 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=36864
Swish                    silu_22                  1 1 64 65
Convolution              conv_87                  1 1 65 66 0=64 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=36864
Swish                    silu_23                  1 1 66 67
BinaryOp                 add_4                    2 1 62 67 68 0=0
Concat                   cat_2                    4 1 51 53 61 68 69 0=0
Convolution              conv_88                  1 1 69 70 0=128 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=1 6=32768
Swish                    silu_24                  1 1 70 71
Split                    splitncnn_6              1 2 71 72 73
Convolution              conv_89                  1 1 73 74 0=256 1=3 11=3 12=1 13=2 14=1 2=1 3=2 4=1 5=1 6=294912
Swish                    silu_25                  1 1 74 75
Convolution              conv_90                  1 1 75 76 0=256 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=1 6=65536
Swish                    silu_26                  1 1 76 77
Slice                    split_3                  1 2 77 78 79 -23300=2,128,128 1=0
Split                    splitncnn_7              1 3 79 80 81 82
Convolution              conv_91                  1 1 82 83 0=128 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=147456
Swish                    silu_27                  1 1 83 84
Convolution              conv_92                  1 1 84 85 0=128 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=147456
Swish                    silu_28                  1 1 85 86
BinaryOp                 add_5                    2 1 81 86 87 0=0
Concat                   cat_3                    3 1 78 80 87 88 0=0
Convolution              conv_93                  1 1 88 89 0=256 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=1 6=98304
Swish                    silu_29                  1 1 89 90
Convolution              conv_94                  1 1 90 91 0=128 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=1 6=32768
Swish                    silu_30                  1 1 91 92
Split                    splitncnn_8              1 2 92 93 94
Pooling                  maxpool2d_0              1 1 94 95 0=0 1=5 11=5 12=1 13=2 2=1 3=2 5=1
Split                    splitncnn_9              1 2 95 96 97
Pooling                  maxpool2d_1              1 1 97 98 0=0 1=5 11=5 12=1 13=2 2=1 3=2 5=1
Split                    splitncnn_10             1 2 98 99 100
Pooling                  maxpool2d_2              1 1 100 101 0=0 1=5 11=5 12=1 13=2 2=1 3=2 5=1
Concat                   cat_4                    4 1 93 96 99 101 102 0=0
Convolution              conv_95                  1 1 102 103 0=256 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=1 6=131072
Swish                    silu_31                  1 1 103 104
Split                    splitncnn_11             1 2 104 105 106
MemoryData               /model.10/Constant_1_output_0 0 1 107 0=0
Resize                   Resize_99                2 1 105 107 108
Concat                   cat_5                    2 1 108 72 109 0=0
Convolution              conv_96                  1 1 109 110 0=128 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=1 6=49152
Swish                    silu_32                  1 1 110 111
Slice                    split_4                  1 2 111 112 113 -23300=2,64,64 1=0
Split                    splitncnn_12             1 2 113 114 115
Convolution              conv_97                  1 1 115 116 0=64 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=36864
Swish                    silu_33                  1 1 116 117
Convolution              conv_98                  1 1 117 118 0=64 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=36864
Swish                    silu_34                  1 1 118 119
Concat                   cat_6                    3 1 112 114 119 120 0=0
Convolution              conv_99                  1 1 120 121 0=128 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=1 6=24576
Swish                    silu_35                  1 1 121 122
Split                    splitncnn_13             1 2 122 123 124
MemoryData               /model.13/Constant_1_output_0 0 1 125 0=0
Resize                   Resize_115               2 1 123 125 126
Concat                   cat_7                    2 1 126 45 127 0=0
Convolution              conv_100                 1 1 127 128 0=64 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=1 6=12288
Swish                    silu_36                  1 1 128 129
Slice                    split_5                  1 2 129 130 131 -23300=2,32,32 1=0
Split                    splitncnn_14             1 2 131 132 133
Convolution              conv_101                 1 1 133 134 0=32 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=9216
Swish                    silu_37                  1 1 134 135
Convolution              conv_102                 1 1 135 136 0=32 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=9216
Swish                    silu_38                  1 1 136 137
Concat                   cat_8                    3 1 130 132 137 138 0=0
Convolution              conv_103                 1 1 138 139 0=64 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=1 6=6144
Swish                    silu_39                  1 1 139 140
Split                    splitncnn_15             1 4 140 141 142 143 144
Convolution              conv_104                 1 1 144 145 0=64 1=3 11=3 12=1 13=2 14=1 2=1 3=2 4=1 5=1 6=36864
Swish                    silu_40                  1 1 145 146
Concat                   cat_9                    2 1 146 124 147 0=0
Convolution              conv_105                 1 1 147 148 0=128 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=1 6=24576
Swish                    silu_41                  1 1 148 149
Slice                    split_6                  1 2 149 150 151 -23300=2,64,64 1=0
Split                    splitncnn_16             1 2 151 152 153
Convolution              conv_106                 1 1 153 154 0=64 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=36864
Swish                    silu_42                  1 1 154 155
Convolution              conv_107                 1 1 155 156 0=64 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=36864
Swish                    silu_43                  1 1 156 157
Concat                   cat_10                   3 1 150 152 157 158 0=0
Convolution              conv_108                 1 1 158 159 0=128 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=1 6=24576
Swish                    silu_44                  1 1 159 160
Split                    splitncnn_17             1 4 160 161 162 163 164
Convolution              conv_109                 1 1 164 165 0=128 1=3 11=3 12=1 13=2 14=1 2=1 3=2 4=1 5=1 6=147456
Swish                    silu_45                  1 1 165 166
Concat                   cat_11                   2 1 166 106 167 0=0
Convolution              conv_110                 1 1 167 168 0=256 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=1 6=98304
Swish                    silu_46                  1 1 168 169
Slice                    split_7                  1 2 169 170 171 -23300=2,128,128 1=0
Split                    splitncnn_18             1 2 171 172 173
Convolution              conv_111                 1 1 173 174 0=128 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=147456
Swish                    silu_47                  1 1 174 175
Convolution              conv_112                 1 1 175 176 0=128 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=147456
Swish                    silu_48                  1 1 176 177
Concat                   cat_12                   3 1 170 172 177 178 0=0
Convolution              conv_113                 1 1 178 179 0=256 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=1 6=98304
Swish                    silu_49                  1 1 179 180
Split                    splitncnn_19             1 3 180 181 182 183
Convolution              conv_114                 1 1 143 184 0=27 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=15552
Swish                    silu_50                  1 1 184 185
Convolution              conv_115                 1 1 185 186 0=27 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=6561
Swish                    silu_51                  1 1 186 187
Convolution              conv_116                 1 1 187 188 0=27 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=1 6=729
Reshape                  reshape_144              1 1 188 189 0=1600 1=27
Convolution              conv_117                 1 1 163 190 0=27 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=31104
Swish                    silu_52                  1 1 190 191
Convolution              conv_118                 1 1 191 192 0=27 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=6561
Swish                    silu_53                  1 1 192 193
Convolution              conv_119                 1 1 193 194 0=27 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=1 6=729
Reshape                  reshape_145              1 1 194 195 0=400 1=27
Convolution              conv_120                 1 1 183 196 0=27 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=62208
Swish                    silu_54                  1 1 196 197
Convolution              conv_121                 1 1 197 198 0=27 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=6561
Swish                    silu_55                  1 1 198 199
Convolution              conv_122                 1 1 199 200 0=27 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=1 6=729
Reshape                  reshape_146              1 1 200 201 0=100 1=27
Concat                   cat_13                   3 1 189 195 201 202 0=1
Convolution              conv_123                 1 1 142 203 0=64 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=36864
Swish                    silu_56                  1 1 203 204
Convolution              conv_124                 1 1 204 205 0=64 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=36864
Swish                    silu_57                  1 1 205 206
Convolution              conv_125                 1 1 206 207 0=64 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=1 6=4096
Convolution              conv_126                 1 1 141 208 0=64 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=36864
Swish                    silu_58                  1 1 208 209
Convolution              conv_127                 1 1 209 210 0=64 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=36864
Swish                    silu_59                  1 1 210 211
Convolution              conv_128                 1 1 211 212 0=5 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=1 6=320
Concat                   cat_14                   2 1 207 212 213 0=0
Convolution              conv_129                 1 1 162 214 0=64 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=73728
Swish                    silu_60                  1 1 214 215
Convolution              conv_130                 1 1 215 216 0=64 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=36864
Swish                    silu_61                  1 1 216 217
Convolution              conv_131                 1 1 217 218 0=64 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=1 6=4096
Convolution              conv_132                 1 1 161 219 0=64 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=73728
Swish                    silu_62                  1 1 219 220
Convolution              conv_133                 1 1 220 221 0=64 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=36864
Swish                    silu_63                  1 1 221 222
Convolution              conv_134                 1 1 222 223 0=5 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=1 6=320
Concat                   cat_15                   2 1 218 223 224 0=0
Convolution              conv_135                 1 1 182 225 0=64 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=147456
Swish                    silu_64                  1 1 225 226
Convolution              conv_136                 1 1 226 227 0=64 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=36864
Swish                    silu_65                  1 1 227 228
Convolution              conv_137                 1 1 228 229 0=64 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=1 6=4096
Convolution              conv_138                 1 1 181 230 0=64 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=147456
Swish                    silu_66                  1 1 230 231
Convolution              conv_139                 1 1 231 232 0=64 1=3 11=3 12=1 13=1 14=1 2=1 3=1 4=1 5=1 6=36864
Swish                    silu_67                  1 1 232 233
Convolution              conv_140                 1 1 233 234 0=5 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=1 6=320
Concat                   cat_16                   2 1 229 234 235 0=0
Reshape                  reshape_147              1 1 213 236 0=1600 1=69
Reshape                  reshape_148              1 1 224 237 0=400 1=69
Reshape                  reshape_149              1 1 235 238 0=100 1=69
Concat                   cat_17                   3 1 236 237 238 239 0=1
Slice                    split_8                  1 2 239 240 241 -23300=2,64,5 1=0
Reshape                  reshape_150              1 1 240 242 0=2100 1=16 2=4
Permute                  permute_142              1 1 242 243 0=4
Softmax                  softmax_68               1 1 243 244 0=2 1=1
Permute                  permute_143              1 1 244 245 0=5
Convolution              conv_141                 1 1 245 246 0=1 1=1 11=1 12=1 13=1 14=0 2=1 3=1 4=0 5=0 6=16
Reshape                  reshape_151              1 1 246 247 0=2100 1=4
Slice                    tensor_split_0           1 2 247 248 249 -23300=2,2,-233 1=0
MemoryData               /model.22/Constant_13_output_0 0 1 250 0=2100 1=2
BinaryOp                 sub_6                    2 1 250 248 251 0=1
Split                    splitncnn_20             1 2 251 252 253
MemoryData               /model.22/Constant_14_output_0 0 1 254 0=2100 1=2
BinaryOp                 add_7                    2 1 254 249 255 0=0
Split                    splitncnn_21             1 2 255 256 257
BinaryOp                 add_8                    2 1 252 256 258 0=0
BinaryOp                 div_9                    1 1 258 259 0=3 1=1 2=2.000000e+00
BinaryOp                 sub_10                   2 1 257 253 260 0=1
Concat                   cat_18                   2 1 259 260 261 0=0
MemoryData               /model.22/Constant_16_output_0 0 1 262 0=2100
Reshape                  reshape_152              1 1 262 263 0=2100 1=1
BinaryOp                 mul_11                   2 1 261 263 264 0=2
Sigmoid                  sigmoid_3                1 1 241 265
Reshape                  reshape_153              1 1 202 266 0=2100 1=3 2=9
Slice                    tensor_split_1           1 2 266 267 268 -23300=2,2,-233 1=1
MemoryData               /model.22/Constant_23_output_0 0 1 269 0=2100 1=2
MemoryData               /model.22/Constant_24_output_0 0 1 270 0=2100
BinaryOp                 mul_12                   1 1 267 271 0=2 1=1 2=2.000000e+00
Reshape                  reshape_154              1 1 269 272 0=2100 1=2 2=1
BinaryOp                 add_13                   2 1 271 272 273 0=0
Reshape                  reshape_155              1 1 270 274 0=2100 1=1 2=1
BinaryOp                 mul_14                   2 1 273 274 275 0=2
Sigmoid                  sigmoid_4                1 1 268 276
Concat                   cat_19                   2 1 275 276 277 0=1
Reshape                  reshape_156              1 1 277 278 0=2100 1=27
Concat                   cat_20                   3 1 264 265 278 out0 0=0

This is my code.

#include "layer.h"
#include "net.h"

#include "opencv2/opencv.hpp"
#include "opencv2/dnn.hpp"

#include <float.h>
#include <math.h>
#include <stdio.h>
#include <algorithm>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>
#include <dirent.h>
#include <sys/stat.h>

// #define MAX_STRIDE 32 // if yolov8-p6 model modify to 64
#define STRIDE 32 // stride used by the model
const int target_size = 320;
const float prob_threshold = 0.25f;
const float nms_threshold = 0.45f;

// Colors for the 9 keypoints
const std::vector<std::vector<unsigned int>> KPS_COLORS = {
    {255, 0, 0},     // Point 1: red
    {0, 255, 0},     // Point 2: green
    {0, 0, 255},     // Point 3: blue
    {255, 255, 0},   // Point 4: yellow
    {255, 0, 255},   // Point 5: purple
    {0, 255, 255},   // Point 6: cyan
    {128, 128, 0},   // Point 7: olive
    {128, 0, 128},   // Point 8: dark purple
    {0, 128, 128}    // Point 9: teal
};

// Order in which the keypoints are connected
const std::vector<std::vector<unsigned int>> SKELETON = {
    {0, 1}, 
    {1, 2}, 
    {2, 3}, 
    {3, 4}, 
    {4, 5}, 
    {5, 6}, 
    {6, 7}, 
    {7, 8}
};

// Color of the lines connecting the keypoints
const std::vector<unsigned int> LIMB_COLOR = {255, 255, 255}; // white connecting lines

// Box colors for the 5 classes
const std::vector<std::vector<unsigned int>> BOX_COLORS = {
    {0, 0, 255},     // class 0: red
    {0, 255, 0},     // class 1: green
    {255, 0, 0},     // class 2: blue
    {255, 255, 0},   // class 3: yellow
    {255, 0, 255}    // class 4: purple
};

struct KeyPoint {
    cv::Point2f pt;
    float confidence;
};

struct Object {
    cv::Rect rect;
    float prob;
    int label;
    std::vector<KeyPoint> keypoints;
};

// Clamp val to the range [min_val, max_val]
inline float clamp(float val, float min_val, float max_val)
{
    return std::max(min_val, std::min(max_val, val));
}

// Sigmoid function
static float sigmoid(const float in)
{
    return 1.f / (1.f + expf(-1.f * in));
}

// src: input array; output_softmax: output array; length: number of elements in both arrays
static float softmax(
    const float* src,
    float* output_softmax,
    int length
)
{
    // Start alpha at the lowest finite float value
    float alpha = -FLT_MAX;

    // Find the maximum value in the input array
    for (int c = 0; c < length; c++)
    {
        float score = src[c];
        if (score > alpha)
        {
            alpha = score;
        }
    }

    // Compute the softmax denominator
    float denominator = 0;
    for (int i = 0; i < length; ++i)
    {
        // Subtract the maximum for numerical stability
        output_softmax[i] = expf(src[i] - alpha);
        denominator += output_softmax[i];
    }

    // Compute the softmax output
    for (int i = 0; i < length; ++i)
    {
        output_softmax[i] /= denominator;
    }

    // Return the largest value in the softmax output
    return *std::max_element(output_softmax, output_softmax + length);
}

// Post-process the model output: decode boxes and keypoints into proposals (NMS is applied afterwards in nms()).
// The raw output has out.h = 36 and out.w = 2100.
void generate_proposals(const ncnn::Mat& out, float prob_threshold, std::vector<Object>& objects, int stride = 32, int input_width = 320, int input_height = 320) {
    int num_anchors = out.w; // each anchor is one candidate detection box
    int num_classes = 5; // 5 classes
    int num_keypoints = 9; // 9 keypoints per detection

    int img_w = input_width;
    int img_h = input_height;

    // Create a new vector new_out of size 2100 x 36
    std::vector<float> new_out(2100 * 36);

    // Manually transpose the data
    for (int i = 0; i < 36; i++) {
        for (int j = 0; j < 2100; j++) {
            new_out[j * 36 + i] = out.row(i)[j];
        }
    }

    // Create a new vector extended_out to hold the 2100 x 37 result
    std::vector<float> extended_out(2100 * 37);

    for (int i = 0; i < num_anchors; i++) {
        float softmax_input[5];
        float softmax_output[5];

        // Gather the softmax input
        for (int j = 0; j < 5; j++) {
            softmax_input[j] = new_out[i * 36 + 4 + j];
        }

        // Call the softmax function
        float max_softmax_score = softmax(softmax_input, softmax_output, 5);

        // Insert the maximum softmax score as the 5th column (index 4) of the new vector
        for (int j = 0; j < 4; j++) {
            extended_out[i * 37 + j] = new_out[i * 36 + j]; // copy the first 4 columns
        }

        extended_out[i * 37 + 4] = max_softmax_score; // insert the softmax maximum

        // Copy the remaining columns into the new vector
        for (int j = 5; j < 37; j++) {
            extended_out[i * 37 + j] = new_out[i * 36 + j - 1];
        }
    }

    // Use the extended vector for the remaining processing
    for (int i = 0; i < num_anchors; i++) {
        const float* pitem = &extended_out[i * 37];

        float cx = pitem[0];
        float cy = pitem[1];
        float width = pitem[2];
        float height = pitem[3];
        float confidence = pitem[4];

        // Filter by box confidence
        if (confidence < prob_threshold)
            continue;

        float left = cx - width * 0.5f;
        float top = cy - height * 0.5f;
        float right = cx + width * 0.5f;
        float bottom = cy + height * 0.5f;

        left = std::max(std::min(left, static_cast<float>(img_w - 1)), 0.f);
        top = std::max(std::min(top, static_cast<float>(img_h - 1)), 0.f);
        right = std::max(std::min(right, static_cast<float>(img_w - 1)), 0.f);
        bottom = std::max(std::min(bottom, static_cast<float>(img_h - 1)), 0.f);

        float max_class_score = 0.0f;
        int label = -1;
        for (int j = 5; j <= 9; j++) {
            float class_score = pitem[j];
            if (class_score > max_class_score) {
                max_class_score = class_score;
                label = j - 5; // class labels start at 0
            }
        }

        Object obj;
        obj.rect.x = left;
        obj.rect.y = top;
        obj.rect.width = right - left;
        obj.rect.height = bottom - top;
        obj.prob = confidence;
        obj.label = label;

        for (int j = 0; j < num_keypoints; j++) {
            float keypoint_x = pitem[10 + j * 3]; // index shifted by one to account for the inserted score column
            float keypoint_y = pitem[10 + j * 3 + 1];
            float keypoint_confidence = pitem[10 + j * 3 + 2];

            // keypoint_confidence = sigmoid(keypoint_confidence);

            // if (keypoint_confidence < 0.5f)
            //     continue; // skip the keypoint if its visibility is below 0.5

            // keypoint_x += (right + left) / 2.f;
            // keypoint_y += (bottom + top) / 2.f;

            // keypoint_x *= 2.f;
            // keypoint_y *= 2.f;

            // keypoint_x = std::max(std::min(keypoint_x, 320.f), 0.f);
            // keypoint_y = std::max(std::min(keypoint_y, 320.f), 0.f);

            // keypoint_x = std::max(std::min(keypoint_x, right), left);
            // keypoint_y = std::max(std::min(keypoint_y, bottom), top);

            KeyPoint kp;
            kp.pt = cv::Point2f(keypoint_x, keypoint_y);
            kp.confidence = keypoint_confidence;

            obj.keypoints.push_back(kp);
        }

        objects.push_back(obj);
    }
}

// nms
void nms(const std::vector<Object>& objects, std::vector<Object>& filtered_objects, float conf_threshold, float iou_threshold) {
    std::vector<cv::Rect> bboxes;
    std::vector<float> scores;
    std::vector<int> indices;

    for (const auto& obj : objects) {
        bboxes.push_back(obj.rect);
        scores.push_back(obj.prob);
    }

    cv::dnn::NMSBoxes(bboxes, scores, conf_threshold, iou_threshold, indices);

    filtered_objects.clear();
    for (int idx : indices) {
        filtered_objects.push_back(objects[idx]);
    }
}

// Detection function
static int detect_yolov8(const cv::Mat& bgr, std::vector<Object>& objects) {
    ncnn::Net mymodel;

    mymodel.opt.use_vulkan_compute = true;

    std::string paramPath = "/home/eze2szh/third_party/ncnn/mypro/best.ncnn.param";
    std::string binPath = "/home/eze2szh/third_party/ncnn/mypro/best.ncnn.bin";
    std::cout << "Loading model from: " << paramPath << std::endl << binPath << std::endl;

    if (mymodel.load_param(paramPath.c_str())) {
        std::cerr << "Failed to load param file" << std::endl;
        return -1;
    }
    if (mymodel.load_model(binPath.c_str())) {
        std::cerr << "Failed to load model file" << std::endl;
        return -1;
    }

    // Letterbox preprocessing (scaling and padding are currently commented out)
    int img_w = bgr.cols;
    int img_h = bgr.rows;

    int w = img_w;
    int h = img_h;
    float scale = 1.f;
    // if (w > h) {
    //     scale = static_cast<float>(target_size) / w;
    //     w = target_size;
    //     h = h * scale;
    // } else {
    //     scale = static_cast<float>(target_size) / h;
    //     h = target_size;
    //     w = w * scale;
    // }

    ncnn::Mat in = ncnn::Mat::from_pixels_resize(bgr.data, ncnn::Mat::PIXEL_BGR2RGB, img_w, img_h, w, h);

    // int wpad = (w + STRIDE - 1) / STRIDE * STRIDE - w;
    // int hpad = (h + STRIDE - 1) / STRIDE * STRIDE - h;
    // ncnn::Mat in_pad;
    // ncnn::copy_make_border(in, in_pad, hpad / 2, hpad - hpad / 2, wpad / 2, wpad - wpad / 2, ncnn::BORDER_CONSTANT, 114.f);

    const float norm_vals[3] = {1 / 255.f, 1 / 255.f, 1 / 255.f};
    // in_pad.substract_mean_normalize(0, norm_vals);
    in.substract_mean_normalize(0, norm_vals);

    double t1, t2;
    t1 = static_cast<double>(cv::getTickCount());

    ncnn::Extractor ex = mymodel.create_extractor();
    // ex.input("images", in_pad);
    ex.input("in0", in);

    std::vector<Object> proposals;

    ncnn::Mat out;
    if (ex.extract("out0", out) != 0) {  // use the output blob name from the .param file ("out0")
        std::cerr << "Failed to extract out0" << std::endl;
        return -1;
    }

    std::cout << "Output layer shape: " 
              << out.w << "x" << out.h << "x" << out.c << std::endl;

    std::ofstream log_file("/home/eze2szh/third_party/ncnn/myoutput/log.txt");

    if (!log_file.is_open()) {
        std::cerr << "Failed to open log file for writing" << std::endl;
        return -1;
    }

    log_file << "Detection Output - 2100 Groups, Each with 36 Elements:\n\n";

    // Write 2100 groups of 36 elements each
    for (int i = 0; i < out.w; ++i) {
        // Descriptive header before each group
        log_file << "Group " << i + 1 << ":\n";

        for (int j = 0; j < out.h; ++j) {
            log_file << out.channel(0)[j * out.w + i] << " ";
        }

        log_file << std::endl; // newline after each group
        log_file << "----------------------------------------\n"; // separator between groups
    }

    log_file.close();

    generate_proposals(out, prob_threshold, proposals);

    t2 = static_cast<double>(cv::getTickCount());
    std::cout << ">> [HumanPose] inference & postprocess time cost: " << (t2 - t1)*1000 / cv::getTickFrequency() << " ms." << std::endl;

    std::vector<Object> final_objects;
    nms(proposals, final_objects, prob_threshold, nms_threshold);

    std::cout << "Number of objects detected: " << final_objects.size() << std::endl;

    // Store the objects that survived NMS back into the output vector
    objects = final_objects;

    return 0;
}

// Draw the bounding boxes and keypoints
void draw_objects(const cv::Mat& bgr, cv::Mat& res, const std::vector<Object>& objects, 
                  const std::vector<std::vector<unsigned int>>& SKELETON, 
                  const std::vector<std::vector<unsigned int>>& KPS_COLORS, 
                  const std::vector<unsigned int>& LIMB_COLOR, 
                  const std::vector<std::vector<unsigned int>>& BOX_COLORS, 
                  const char* class_names[]) 
{
    res = bgr.clone();

    for (const auto& obj : objects) {
        // Draw the bounding box
        const auto& box_color = BOX_COLORS[obj.label];
        cv::Scalar box_color_cv(box_color[0], box_color[1], box_color[2]);
        cv::rectangle(res, obj.rect, box_color_cv, 2);

        // Draw the class label and confidence
        std::string label = class_names[obj.label];
        std::string text = label + " (" + std::to_string(obj.prob) + ")";
        int baseLine;
        cv::Size label_size = cv::getTextSize(text, cv::FONT_HERSHEY_SIMPLEX, 0.5, 1, &baseLine);
        cv::rectangle(res, cv::Point(obj.rect.x, obj.rect.y - label_size.height),
                      cv::Point(obj.rect.x + label_size.width, obj.rect.y + baseLine), box_color_cv, cv::FILLED);
        cv::putText(res, text, cv::Point(obj.rect.x, obj.rect.y), cv::FONT_HERSHEY_SIMPLEX, 0.5, cv::Scalar(255, 255, 255), 1);

        std::cout << "Drawing box: " << text << ", rect=(" << obj.rect.x << ", " << obj.rect.y 
                  << ", " << obj.rect.width << ", " << obj.rect.height << ")" << std::endl;

        // Draw keypoints
        for (size_t j = 0; j < obj.keypoints.size(); j++) {
            const auto& kp = obj.keypoints[j];

            // // Without confidence filtering: draw every point
            // const auto& kp_color = KPS_COLORS[j];
            // cv::Scalar kp_color_cv(kp_color[0], kp_color[1], kp_color[2]);
            // cv::circle(res, kp.pt, 3, kp_color_cv, -1);

            // std::cout << "  Drawing KeyPoint " << j << ": (" << kp.pt.x << ", " << kp.pt.y 
            //             << "), confidence=" << kp.confidence << std::endl;

            if (kp.confidence > 0.5f) {  // draw the keypoint only when its confidence exceeds 0.5
                const auto& kp_color = KPS_COLORS[j];
                cv::Scalar kp_color_cv(kp_color[0], kp_color[1], kp_color[2]);
                cv::circle(res, kp.pt, 3, kp_color_cv, -1);

                std::cout << "  Drawing KeyPoint " << j << ": (" << kp.pt.x << ", " << kp.pt.y 
                          << "), confidence=" << kp.confidence << std::endl;
            }
        }

        // Draw the skeleton
        for (const auto& limb : SKELETON) {
            unsigned int idx0 = limb[0];
            unsigned int idx1 = limb[1];
            if (idx0 < obj.keypoints.size() && idx1 < obj.keypoints.size()) {
                const KeyPoint& kp0 = obj.keypoints[idx0];
                const KeyPoint& kp1 = obj.keypoints[idx1];
                // To draw every limb, change this condition; the normal threshold is > 0.5f
                if (kp0.confidence > 0.5f && kp1.confidence > 0.5f) {
                    cv::line(res, kp0.pt, kp1.pt, cv::Scalar(LIMB_COLOR[0], LIMB_COLOR[1], LIMB_COLOR[2]), 2);

                    std::cout << "  Drawing point connection: (" << kp0.pt.x << ", " << kp0.pt.y 
                              << ") -> (" << kp1.pt.x << ", " << kp1.pt.y << ")" << std::endl;
                }
            }
        }
    }
}

int main(int argc, char** argv)
{
    if (argc != 2)
    {
        fprintf(stderr, "Usage: %s [imagepath]\n", argv[0]);
        return -1;
    }

    const char* imagepath = argv[1];

    // Read the specified image file
    cv::Mat m = cv::imread(imagepath, 1);
    if (m.empty())
    {
        std::cerr << "cv::imread " << imagepath << " failed" << std::endl;
        return -1;
    }

    // Output directory
    std::string output_folder = "/home/eze2szh/third_party/ncnn/myoutput";
    mkdir(output_folder.c_str(), 0777);  // create the output directory

    std::vector<Object> objects;
    if (detect_yolov8(m, objects) != 0) {
        std::cerr << "Detection failed for image: " << imagepath << std::endl;
        return -1;
    }

    // Image that will hold the drawn results
    cv::Mat res;
    const char* class_names[] = {"Car", "Square pillar", "Round pillar", "Wall", "Others"};
    draw_objects(m, res, objects, SKELETON, KPS_COLORS, LIMB_COLOR, BOX_COLORS, class_names);

    // Get the file name of the input image
    std::string filename = imagepath;
    size_t pos = filename.find_last_of("/\\");
    if (pos != std::string::npos)
    {
        filename = filename.substr(pos + 1);
    }

    // Build the full path of the output image
    std::string output_path = output_folder + "/" + filename;
    if (!cv::imwrite(output_path, res)) {
        std::cerr << "Failed to write output image to " << output_path << std::endl;
    } else {
        std::cout << "Result saved to " << output_path << std::endl;
    }

    // Optional: display the result (can be kept when only a single image is processed)
    // cv::imshow("result", res);
    // cv::waitKey(0);

    return 0;
}
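
For readers following the log-dump loop above, here is a minimal sketch of how one of the 2100 columns could be split into its 36 values (4 box values, 5 class scores, 9 keypoints with x, y and visibility, as described in the problem statement). It works on a plain row-major float buffer so that it compiles without ncnn. The names `DecodedAnchor` and `decode_anchor` are hypothetical, and the box order (center x, center y, width, height) and the idea that the scores need no further activation are assumptions, not something the post confirms.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Layout taken from the problem description: 4 + 5 + 9 * 3 = 36 features per anchor.
constexpr int kNumBox = 4;
constexpr int kNumClasses = 5;
constexpr int kNumKeypoints = 9;
constexpr int kNumFeatures = kNumBox + kNumClasses + kNumKeypoints * 3;
static_assert(kNumFeatures == 36, "layout must match the 36-row output");

// Hypothetical container for one decoded column.
struct DecodedAnchor {
    float cx, cy, w, h;                          // assumed box order: center x/y, width, height
    int label;                                   // index of the best class score
    float score;                                 // best class score (assumed already activated)
    std::array<float, kNumKeypoints * 3> kps;    // x, y, visibility per keypoint
};

// data: row-major buffer with kNumFeatures rows and num_anchors columns,
// i.e. the same indexing as data[j * out.w + i] in the loop above.
inline DecodedAnchor decode_anchor(const float* data, int num_anchors, int i)
{
    assert(i >= 0 && i < num_anchors);
    auto at = [&](int feature) {
        return data[static_cast<std::size_t>(feature) * num_anchors + i];
    };

    DecodedAnchor a{};
    a.cx = at(0);
    a.cy = at(1);
    a.w = at(2);
    a.h = at(3);

    // Pick the class with the highest score.
    a.label = 0;
    a.score = at(kNumBox);
    for (int c = 1; c < kNumClasses; ++c) {
        const float s = at(kNumBox + c);
        if (s > a.score) {
            a.score = s;
            a.label = c;
        }
    }

    // Copy the 27 keypoint values that follow the class scores.
    for (int k = 0; k < kNumKeypoints * 3; ++k) {
        a.kps[k] = at(kNumBox + kNumClasses + k);
    }
    return a;
}
```

With the code in this post, a call would look like `decode_anchor(out.channel(0), out.w, i)`, assuming `out` really is the 2100 x 36 float matrix printed by the shape log.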
Mauerrr commented 3 months ago

Problem solved! I love teacher nihui. Teacher nihui yyds (the GOAT)!!!! https://github.com/Tencent/ncnn/issues/1298#issuecomment-2268325330