Hm, I've never seen that "Error message: Protobuf parsing failed" before. That syscall seems to correspond to `tgkill`, so I think that's just the runner trying to kill itself after the failed protobuf parse.
I cloned a fresh version of the repo (on centos7) and couldn't reproduce the issue, which is going to make this a bit harder to debug. (Just as a quick sanity check, can you run `md5sum googlenet_quantized.onnx`? It should be `e1360471e07e0810ee3696eb44e66c57`.)
The only significant difference I can see at the moment is that your version of `pk`/`spike` seems to be a bit newer than what I've been using (in particular, mine doesn't print "bbl loader").
Can you try building `spike` via the following:

1. Clone https://github.com/ucb-bar/esp-tools (using the default master branch; the latest commit should be `dcb6012f7`).
2. Apply the syscall patches inside the `riscv-pk` folder.
3. Inside `riscv-isa-sim`, check out commit `506d83f6a8d` on the master branch.
4. Build `spike` and try running with that freshly built version of `spike`.
Hopefully this works; if not, we can do some more digging (the fact that protobuf parsing fails seems to indicate a pretty fundamental issue unrelated to the Gemmini extension).
As for the calibration script, it's written to assume that the `--dataset_path` directory contains a bunch of folders labeled `test_data_set_1`, `test_data_set_2`, etc., each of which contains an `input_0.pb`, `input_1.pb`, and so on. I haven't checked what the folder structure of the arcface model looks like, but the error seems to indicate that the folder structure might not match what's described above.
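For reference, here's a small hypothetical helper (not part of the repo) that checks whether a dataset directory matches the layout described above:

```python
import glob
import os

def check_dataset_layout(dataset_path):
    """Sanity-check that dataset_path contains test_data_set_* folders with input_*.pb files."""
    case_dirs = sorted(glob.glob(os.path.join(dataset_path, "test_data_set_*")))
    if not case_dirs:
        raise ValueError(f"No test_data_set_* folders found in {dataset_path}")
    for case_dir in case_dirs:
        inputs = sorted(glob.glob(os.path.join(case_dir, "input_*.pb")))
        print(f"{case_dir}: {len(inputs)} input protobuf(s)")
        if not inputs:
            raise ValueError(f"{case_dir} contains no input_*.pb files")

# e.g. check_dataset_layout("path/to/arcface_data")
```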
Also, do note that the `imagenet_runner` was hard-coded to use fixed 224x224 PNGs as input, so you'll probably have to create your own runner for arcface, especially since it seems that the model can accept variable-sized images (so you'll have to manually set the input tensor dimensions – you can look at the upstream onnxruntime API docs for more info).
One last point on the mxnet-derived models: I've found them to be very finicky and sensitive to quantization. You might want to first get the non-quantized (i.e. floating-point) model running (for which you'll likely need to write your own runner), and then start playing around with quantization. The accuracy for ResNet models is also a bit poor at the moment because we're limited to power-of-2 scale factors. I've also never really tried the quantization/calibration scripts with non-imagenet models (although they should presumably work, since they were modified from Microsoft's upstream implementation), so I'd be interested to hear how it turns out!
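As a rough illustration of that power-of-2 limitation (a hedged sketch, not the repo's actual rounding code), snapping an arbitrary scale factor to the nearest power of two looks like this:

```python
import numpy as np

def nearest_power_of_two(scale):
    """Round a positive scale factor to the nearest power of two (illustrative only)."""
    return float(2.0 ** np.round(np.log2(scale)))

print(nearest_power_of_two(0.0218))  # -> 0.015625, i.e. 2**-6
```

The gap between the true scale and its power-of-2 approximation is one reason the quantized accuracy can drop.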
Oh, another thing you can try is to run it with qemu. If you download qemu and enable RISC-V user-space emulation via `./configure --disable-system --target-list=riscv64-linux-user`, you should get a `qemu-riscv64` binary that you can use instead of spike (just substitute `qemu-riscv64` for `spike --extension=gemmini pk`). Of course, if you use qemu you'll have to use `-x 0` (don't use gemmini instructions, i.e. emulate them on the CPU only), but this should help us debug where the issue lies.
Thanks a lot. The md5 value did not match, so I re-downloaded the onnx file through a proxy, and now it works.
And I had messed up the structure of the arcface folder, which is why I got that error. Now, inside the folder I have an onnx file and 3 folders named `test_data_set_0`, `test_data_set_1`, `test_data_set_2`; each folder contains an `input_0.pb` and an `output_0.pb`. I tried to quantize the network again, and it shows the output below. What causes this error?
Num cases 3, num inputs for each cases 1
2020-07-16 17:11:44.181980728 [E:onnxruntime:, sequential_executor.cc:281 Execute] Non-zero status code returned while running BatchNormalization node. Name:'bn0' Status Message: Invalid input scale: NumDimensions() != 3
Traceback (most recent call last):
  File "calibrate.py", line 379, in <module>
    main()
  File "calibrate.py", line 358, in main
    inputs, calib_mode)
  File "calibrate.py", line 118, in get_intermediate_outputs
    for j in range(num_input_names)}) for i in range(num_inputs)
  File "calibrate.py", line 118, in <listcomp>
    for j in range(num_input_names)}) for i in range(num_inputs)
  File "/usr/local/lib64/python3.6/site-packages/onnxruntime/capi/session.py", line 111, in run
    return self._sess.run(output_names, input_feed, run_options)
onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Non-zero status code returned while running BatchNormalization node. Name:'bn0' Status Message: Invalid input scale: NumDimensions() != 3
Seems like this is relevant; please read through it (and any associated linked issues therein) and try the suggested fix?
I have tried the fix, and it gives me another error. It seems the problem is caused by the network itself, because I also tried googlenet from the onnx model zoo and it succeeded. So maybe I will build a new network, convert it to onnx format, and then quantize it. If I do so, is there anything I should pay attention to so that I can avoid strange problems like the ones that happened during arcface's conversion? Thank you very much.
> and it gives me another error.

What error did it give? I do recall that mxnet-based models had some batch-normalization version mismatch after quantization, because they use an attribute that was removed in opsets newer than opset version 7 (and quantized operators require opset 10 or above). If that was the case, you could try using this tool to convert it to a newer opset before running the quantization, but as you suggested it's probably a better idea to just export a new model from PyTorch with the latest opset.
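For instance, a hedged sketch of converting the opset with the standard onnx version converter (which may or may not be the exact tool linked above; the file names and target opset 11 are placeholders):

```python
import onnx
from onnx import version_converter

# Load the exported model, bump it to an opset that supports quantized operators (>= 10),
# and save the converted copy.
model = onnx.load("arcface.onnx")
converted = version_converter.convert_version(model, 11)
onnx.save(converted, "arcface_opset11.onnx")
```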
> Is there anything I should pay attention to, so that I can avoid some strange problems like what happens during the arcface's conversion
I think just making sure to export the latest opset version (anything >= version 10 should be fine) is sufficient. Let me know if you run into any issues though.
> What error did it give?
This is the error. It seems that it is not caused by the opset version, because the original network is opset version 8 and gives the same error as the opset version 9 one converted by `convert_to_opset9.py`.
Num cases 3, num inputs for each cases 1
{'conv0': (-2.7700576782226562, 1.8512709140777588), '_mulscalar0': (-1.0126389265060425, -0.9754691123962402), 'stage1_unit1_conv1': (-1.5255255699157715, 1.1363569498062134), 'stage1_unit1_bn1': (-2.677807331085205, 2.351346015930176), 'stage1_unit1_conv2': (-1.0140470266342163, 0.8351755142211914), 'stage1_unit1_relu1': (-0.022769352421164513, 1.9310780763626099), 'stage1_unit1_conv1sc': (-0.3902704417705536, 0.36174634099006653), 'relu0': (-0.5688446164131165, 1.088736653327942), 'stage1_unit2_conv1': (-0.9079145789146423, 1.004768967628479), 'stage1_unit2_bn1': (-0.7862108945846558, 0.780624508857727), 'stage1_unit2_conv2': (-1.0263770818710327, 0.8691070079803467), 'stage1_unit2_relu1': (-0.7567529678344727, 1.049529790878296), 'stage1_unit3_conv1': (-0.5032097697257996, 0.7926806211471558), 'stage1_unit3_bn1': (-1.4762139320373535, 1.992246389389038), 'stage1_unit3_conv2': (-0.4169164001941681, 0.9362981915473938), 'stage1_unit3_relu1': (3.6227927324716802e-09, 2.7148284912109375), 'stage2_unit1_conv1': (-0.580380916595459, 1.6314289569854736), 'stage2_unit1_bn1': (-1.9812438488006592, 4.54368257522583), 'stage2_unit1_conv2': (-1.1352336406707764, 1.2988908290863037), 'stage2_unit1_relu1': (-0.011583560146391392, 5.083901882171631), 'stage2_unit1_conv1sc': (-0.9727423191070557, 1.723375916481018), '_plus2': (-1.670774221420288, 10.384163856506348), 'stage2_unit2_conv1': (-0.2573625147342682, 0.13485988974571228), 'stage2_unit2_bn1': (-0.6998764872550964, 0.27929526567459106), 'stage2_unit2_conv2': (-0.24395932257175446, 0.2444741427898407), 'stage2_unit2_relu1': (3.544041060621339e-09, 1.6052155494689941), 'stage2_unit3_conv1': (-0.33123549818992615, 0.41484981775283813), 'stage2_unit3_bn1': (-0.8922042846679688, 0.544468343257904), 'stage2_unit3_conv2': (-0.5210393071174622, 0.2614585757255554), 'stage2_unit3_relu1': (-0.004366954322904348, 1.0961897373199463), 'stage2_unit4_conv1': (-0.21901430189609528, 0.6279692053794861), 'stage2_unit4_bn1': (-0.9523730278015137, 0.4922447204589844), 'stage2_unit4_conv2': (-0.24297407269477844, 0.27929195761680603), 'stage2_unit4_relu1': (-0.004497249145060778, 0.7889650464057922), 'stage2_unit5_conv1': (-0.397145539522171, 0.7217621207237244), 'stage2_unit5_bn1': (-0.6942195296287537, 0.5839004516601562), 'stage2_unit5_conv2': (-0.21408230066299438, 0.4311734437942505), 'stage2_unit5_relu1': (-0.026107193902134895, 0.5513373613357544), 'stage2_unit6_conv1': (-0.3651759624481201, 0.6158580780029297), 'stage2_unit6_bn1': (-0.5249855518341064, 0.7201350331306458), 'stage2_unit6_conv2': (-0.26098182797431946, 0.27301323413848877), 'stage2_unit6_relu1': (-0.0027470411732792854, 0.5985389947891235), 'stage2_unit7_conv1': (-0.27477845549583435, 0.5691264271736145), 'stage2_unit7_bn1': (-0.6672435998916626, 0.7075446248054504), 'stage2_unit7_conv2': (-0.19847100973129272, 0.2942486107349396), 'stage2_unit7_relu1': (-0.0025474284775555134, 0.44384798407554626), 'stage2_unit8_conv1': (-0.295512318611145, 0.42267584800720215), 'stage2_unit8_bn1': (-0.5995339155197144, 0.6014330387115479), 'stage2_unit8_conv2': (-0.166782408952713, 0.25694501399993896), 'stage2_unit8_relu1': (-0.0017675124108791351, 0.33467692136764526), 'stage2_unit9_conv1': (-0.41656625270843506, 0.37687447667121887), 'stage2_unit9_bn1': (-0.6179551482200623, 0.39605095982551575), 'stage2_unit9_conv2': (-0.2933601438999176, 0.25816017389297485), 'stage2_unit9_relu1': (2.3207581989481696e-07, 0.5152771472930908), 'stage2_unit10_conv1': (-0.4387154281139374, 0.45474547147750854), 
'stage2_unit10_bn1': (-0.6467996835708618, 0.5094567537307739), 'stage2_unit10_conv2': (-0.21270819008350372, 0.2974368631839752), 'stage2_unit10_relu1': (2.644956111907959e-07, 0.35188809037208557), 'stage2_unit11_conv1': (-0.35708561539649963, 0.3530902862548828), 'stage2_unit11_bn1': (-0.5718078017234802, 0.4663313329219818), 'stage2_unit11_conv2': (-0.20177653431892395, 0.26275286078453064), 'stage2_unit11_relu1': (9.807602197042797e-08, 0.3139716386795044), 'stage2_unit12_conv1': (-0.2546583414077759, 0.31906643509864807), 'stage2_unit12_bn1': (-0.4647640883922577, 0.44840580224990845), 'stage2_unit12_conv2': (-0.18789871037006378, 0.14361216127872467), 'stage2_unit12_relu1': (9.652539034732399e-08, 0.3494085967540741), 'stage2_unit13_conv1': (-0.41568630933761597, 0.395785927772522), 'stage2_unit13_bn1': (-0.6449421048164368, 0.5393414497375488), 'stage2_unit13_conv2': (-0.3820760250091553, 0.6321218609809875), 'stage2_unit13_relu1': (1.5871721714688647e-08, 0.4499181807041168), 'stage3_unit1_conv1': (-0.5726640820503235, 0.827939510345459), 'stage3_unit1_bn1': (-0.7210996150970459, 0.8041184544563293), 'stage3_unit1_conv2': (-0.4858264923095703, 0.3447681665420532), 'stage3_unit1_relu1': (-0.018802566453814507, 0.411818265914917), 'stage3_unit1_conv1sc': (-1.9470412731170654, 1.312037467956543), '_plus15': (-4.541324615478516, 2.1670632362365723), 'stage3_unit2_conv1': (-0.22424371540546417, 0.14657652378082275), 'stage3_unit2_bn1': (-0.40753498673439026, 0.3977765142917633), 'stage3_unit2_conv2': (-0.17768806219100952, 0.2587220072746277), 'stage3_unit2_relu1': (2.494394664778743e-12, 0.39966338872909546), 'stage3_unit3_conv1': (-0.1841924488544464, 0.17269444465637207), 'stage3_unit3_bn1': (-0.3992823660373688, 0.3318554759025574), 'stage3_unit3_conv2': (-0.14911897480487823, 0.15426433086395264), 'stage3_unit3_relu1': (4.030409428423809e-08, 0.30615344643592834), 'stage3_unit4_conv1': (-0.20091697573661804, 0.2796117663383484), 'stage3_unit4_bn1': (-0.5242156386375427, 0.38267460465431213), 'stage3_unit4_conv2': (-0.15447556972503662, 0.17338211834430695), 'stage3_unit4_relu1': (-0.001559401280246675, 0.395333856344223), 'stage3_unit5_conv1': (-0.3620983958244324, 0.2256709337234497), 'stage3_unit5_bn1': (-0.47639572620391846, 0.5078409910202026), 'stage3_unit5_conv2': (-0.2632167935371399, 0.22298793494701385), 'stage3_unit5_relu1': (-0.004165561404079199, 0.34686315059661865), 'stage3_unit6_conv1': (-0.519700288772583, 0.40025827288627625), 'stage3_unit6_bn1': (-0.6167299747467041, 0.5394785404205322), 'stage3_unit6_conv2': (-0.2962507903575897, 0.286422997713089), 'stage3_unit6_relu1': (-0.003223944455385208, 0.49111926555633545), 'stage3_unit7_conv1': (-0.2245730757713318, 0.29485976696014404), 'stage3_unit7_bn1': (-0.5814918279647827, 0.7422151565551758), 'stage3_unit7_conv2': (-0.1640111804008484, 0.1288129687309265), 'stage3_unit7_relu1': (1.3635625831578957e-12, 0.36254796385765076), 'stage3_unit8_conv1': (-0.3018781542778015, 0.27658092975616455), 'stage3_unit8_bn1': (-0.5469644665718079, 0.5125359296798706), 'stage3_unit8_conv2': (-0.2527916729450226, 0.2789697051048279), 'stage3_unit8_relu1': (3.091239070274199e-11, 0.502379834651947), 'stage3_unit9_conv1': (-0.545325756072998, 0.24950751662254333), 'stage3_unit9_bn1': (-0.6723271012306213, 0.7246034741401672), 'stage3_unit9_conv2': (-0.15685269236564636, 0.11693911254405975), 'stage3_unit9_relu1': (-0.01037586573511362, 0.3197324275970459), 'stage3_unit10_conv1': (-0.5106210112571716, 0.30369535088539124), 
'stage3_unit10_bn1': (-0.6339330077171326, 0.6078217625617981), 'stage3_unit10_conv2': (-0.20839078724384308, 0.21890105307102203), 'stage3_unit10_relu1': (-0.0024764223489910364, 0.3341023027896881), 'stage3_unit11_conv1': (-0.435392826795578, 0.3541238307952881), 'stage3_unit11_bn1': (-0.5874464511871338, 0.5702937245368958), 'stage3_unit11_conv2': (-0.301033616065979, 0.32429423928260803), 'stage3_unit11_relu1': (-0.0008741768542677164, 0.33039283752441406), 'stage3_unit12_conv1': (-0.37518972158432007, 0.4138906002044678), 'stage3_unit12_bn1': (-0.5808767080307007, 0.6330219507217407), 'stage3_unit12_conv2': (-0.3561904728412628, 0.43344947695732117), 'stage3_unit12_relu1': (-0.0017991125350818038, 0.4858672618865967), 'stage3_unit13_conv1': (-0.3747340142726898, 0.3277401030063629), 'stage3_unit13_bn1': (-0.5056686997413635, 0.8359493613243103), 'stage3_unit13_conv2': (-0.16904059052467346, 0.1480470448732376), 'stage3_unit13_relu1': (-0.011989030055701733, 0.38342130184173584), 'stage3_unit14_conv1': (-0.2563401460647583, 0.2480783462524414), 'stage3_unit14_bn1': (-0.390505313873291, 0.8186060786247253), 'stage3_unit14_conv2': (-0.10976225882768631, 0.09959360957145691), 'stage3_unit14_relu1': (8.161140385709587e-08, 0.321338951587677), 'stage3_unit15_conv1': (-0.25992459058761597, 0.33974120020866394), 'stage3_unit15_bn1': (-0.401766836643219, 0.5625117421150208), 'stage3_unit15_conv2': (-0.16411973536014557, 0.20697088539600372), 'stage3_unit15_relu1': (-0.033499810844659805, 0.33262568712234497), 'stage3_unit16_conv1': (-0.12597525119781494, 0.16341085731983185), 'stage3_unit16_bn1': (-0.642011821269989, 0.7743183374404907), 'stage3_unit16_conv2': (-0.15693651139736176, 0.11338046938180923), 'stage3_unit16_relu1': (9.637012077234886e-09, 0.3087193965911865), 'stage3_unit17_conv1': (-0.35468026995658875, 0.3821070194244385), 'stage3_unit17_bn1': (-0.6037321090698242, 0.6920698285102844), 'stage3_unit17_conv2': (-0.17673300206661224, 0.26448488235473633), 'stage3_unit17_relu1': (1.4634181866313156e-07, 0.464324027299881), 'stage3_unit18_conv1': (-0.3813590407371521, 0.2828254997730255), 'stage3_unit18_bn1': (-0.5927966833114624, 0.6548709273338318), 'stage3_unit18_conv2': (-0.17128826677799225, 0.1501263678073883), 'stage3_unit18_relu1': (-0.007886271923780441, 0.44737768173217773), 'stage3_unit19_conv1': (-0.22415035963058472, 0.2810840904712677), 'stage3_unit19_bn1': (-0.7576963901519775, 1.069539189338684), 'stage3_unit19_conv2': (-0.2111537754535675, 0.1566932052373886), 'stage3_unit19_relu1': (-0.003581683151423931, 0.5865971446037292), 'stage3_unit20_conv1': (-0.32629501819610596, 0.30639371275901794), 'stage3_unit20_bn1': (-0.7453984022140503, 0.7082696557044983), 'stage3_unit20_conv2': (-0.16008047759532928, 0.11353600025177002), 'stage3_unit20_relu1': (1.0354535788792418e-06, 0.45895808935165405), 'stage3_unit21_conv1': (-0.1636548638343811, 0.21434618532657623), 'stage3_unit21_bn1': (-0.6586788892745972, 0.8185128569602966), 'stage3_unit21_conv2': (-0.19033634662628174, 0.19410300254821777), 'stage3_unit21_relu1': (3.796392367139134e-10, 0.2997061014175415), 'stage3_unit22_conv1': (-0.12159842997789383, 0.1409245729446411), 'stage3_unit22_bn1': (-0.5207512974739075, 0.674064576625824), 'stage3_unit22_conv2': (-0.12496834993362427, 0.20494888722896576), 'stage3_unit22_relu1': (7.566335114006506e-08, 0.25298017263412476), 'stage3_unit23_conv1': (-0.11415666341781616, 0.14841502904891968), 'stage3_unit23_bn1': (-0.5651000142097473, 0.7120834589004517), 'stage3_unit23_conv2': 
(-0.09524092078208923, 0.0823959931731224), 'stage3_unit23_relu1': (8.248670724242402e-08, 0.20683155953884125), 'stage3_unit24_conv1': (-0.21861375868320465, 0.21948696672916412), 'stage3_unit24_bn1': (-0.4840283691883087, 0.8351663947105408), 'stage3_unit24_conv2': (-0.15638257563114166, 0.09795545041561127), 'stage3_unit24_relu1': (3.829651404885226e-07, 0.3478612005710602), 'stage3_unit25_conv1': (-0.16192227602005005, 0.17361211776733398), 'stage3_unit25_bn1': (-0.5178247690200806, 0.5179949998855591), 'stage3_unit25_conv2': (-0.07853621244430542, 0.09793967008590698), 'stage3_unit25_relu1': (1.5551707122085645e-07, 0.20840789377689362), 'stage3_unit26_conv1': (-0.20955918729305267, 0.1912509649991989), 'stage3_unit26_bn1': (-0.5183373689651489, 0.5781286358833313), 'stage3_unit26_conv2': (-0.2181871086359024, 0.08206260949373245), 'stage3_unit26_relu1': (1.273264871315405e-07, 0.260122686624527), 'stage3_unit27_conv1': (-0.23040422797203064, 0.22308039665222168), 'stage3_unit27_bn1': (-0.47349828481674194, 0.4709331691265106), 'stage3_unit27_conv2': (-0.06822191178798676, 0.06991805881261826), 'stage3_unit27_relu1': (3.3852268188638845e-07, 0.2227790206670761), 'stage3_unit28_conv1': (-0.287468820810318, 0.35272350907325745), 'stage3_unit28_bn1': (-0.6915250420570374, 0.4747917354106903), 'stage3_unit28_conv2': (-0.07850893586874008, 0.2568800151348114), 'stage3_unit28_relu1': (3.09994362623911e-07, 0.36390799283981323), 'stage3_unit29_conv1': (-0.2683931887149811, 0.29413872957229614), 'stage3_unit29_bn1': (-0.609352707862854, 0.6880134344100952), 'stage3_unit29_conv2': (-0.26083219051361084, 0.11767230927944183), 'stage3_unit29_relu1': (-0.019393207505345345, 0.22927677631378174), 'stage3_unit30_conv1': (-0.22444383800029755, 0.24487926065921783), 'stage3_unit30_bn1': (-0.45070981979370117, 0.5351380109786987), 'stage3_unit30_conv2': (-0.09777919948101044, 0.07721813768148422), 'stage3_unit30_relu1': (-0.003372140461578965, 0.26520460844039917), 'stage4_unit1_conv1': (-0.3942401111125946, 0.49705347418785095), 'stage4_unit1_bn1': (-0.7365911602973938, 0.5604126453399658), 'stage4_unit1_conv2': (-0.10353611409664154, 0.09256849437952042), 'stage4_unit1_relu1': (6.8581509360399195e-09, 0.3770582675933838), 'stage4_unit1_conv1sc': (-1.0546667575836182, 1.5163278579711914), '_plus45': (-3.2545576095581055, 4.6348419189453125), 'stage4_unit2_conv1': (-0.13596974313259125, 0.13410010933876038), 'stage4_unit2_bn1': (-0.20413748919963837, 0.25919032096862793), 'stage4_unit2_conv2': (-0.034581054002046585, 0.03934990242123604), 'stage4_unit2_relu1': (-0.0, 0.14119192957878113), 'stage4_unit3_conv1': (0.0, 0.0), 'stage4_unit3_bn1': (-7.121272678836931e-39, 6.233115699163219e-39), 'stage4_unit3_conv2': (0.0, 0.0), 'stage4_unit3_relu1': (-0.0, 1.975722934716239e-39)}
{'conv0': [0, 0.021811477781280758], '_mulscalar0': [0, 0.007973534854378288], 'stage1_unit1_conv1': [0, 0.012012012361541508], 'stage1_unit1_bn1': [0, 0.021085097095159096], 'stage1_unit1_conv2': [0, 0.007984622256962334], 'stage1_unit1_relu1': [0, 0.015205339183957558], 'stage1_unit1_conv1sc': [0, 0.003072995604492548], 'relu0': [0, 0.008572729553763323], 'stage1_unit2_conv1': [0, 0.007911566674239992], 'stage1_unit2_bn1': [0, 0.00619063696523351], 'stage1_unit2_conv2': [0, 0.008081709306071124], 'stage1_unit2_relu1': [0, 0.008264014101403904], 'stage1_unit3_conv1': [0, 0.006241579694072093], 'stage1_unit3_bn1': [0, 0.015686979444008174], 'stage1_unit3_conv2': [0, 0.007372426705097589], 'stage1_unit3_relu1': [0, 0.021376602292999508], 'stage2_unit1_conv1': [0, 0.012845897299098217], 'stage2_unit1_bn1': [0, 0.03577702815138449], 'stage2_unit1_conv2': [0, 0.010227486843199242], 'stage2_unit1_relu1': [0, 0.040030723481666385], 'stage2_unit1_conv1sc': [0, 0.013569889106149749], '_plus2': [0, 0.08176506973627046], 'stage2_unit2_conv1': [0, 0.002026476493970616], 'stage2_unit2_bn1': [0, 0.005510838482323594], 'stage2_unit2_conv2': [0, 0.0019249932503137062], 'stage2_unit2_relu1': [0, 0.01263949251550389], 'stage2_unit3_conv1': [0, 0.0032665339980538434], 'stage2_unit3_bn1': [0, 0.007025230587936762], 'stage2_unit3_conv2': [0, 0.004102671709586316], 'stage2_unit3_relu1': [0, 0.008631415254487766], 'stage2_unit4_conv1': [0, 0.004944639412436899], 'stage2_unit4_bn1': [0, 0.007499000218909557], 'stage2_unit4_conv2': [0, 0.0021991492725732757], 'stage2_unit4_relu1': [0, 0.006212323200045608], 'stage2_unit5_conv1': [0, 0.005683166304911215], 'stage2_unit5_bn1': [0, 0.005466295508887824], 'stage2_unit5_conv2': [0, 0.0033950664865689015], 'stage2_unit5_relu1': [0, 0.0043412390656358615], 'stage2_unit6_conv1': [0, 0.004849276204747478], 'stage2_unit6_bn1': [0, 0.005670354591579888], 'stage2_unit6_conv2': [0, 0.002149710505027471], 'stage2_unit6_relu1': [0, 0.0047129054707805], 'stage2_unit7_conv1': [0, 0.004481310450185941], 'stage2_unit7_bn1': [0, 0.005571217518153153], 'stage2_unit7_conv2': [0, 0.002316918194763304], 'stage2_unit7_relu1': [0, 0.0034948660163428838], 'stage2_unit8_conv1': [0, 0.003328156283521277], 'stage2_unit8_bn1': [0, 0.004735693218201164], 'stage2_unit8_conv2': [0, 0.0020231890866136927], 'stage2_unit8_relu1': [0, 0.0026352513493515376], 'stage2_unit9_conv1': [0, 0.0032800492339246853], 'stage2_unit9_bn1': [0, 0.004865788568661908], 'stage2_unit9_conv2': [0, 0.002309922392912737], 'stage2_unit9_relu1': [0, 0.0040573003723865415], 'stage2_unit10_conv1': [0, 0.00358067300375991], 'stage2_unit10_bn1': [0, 0.005092910894258755], 'stage2_unit10_conv2': [0, 0.00234202254475571], 'stage2_unit10_relu1': [0, 0.002770772365134532], 'stage2_unit11_conv1': [0, 0.0028116977590275562], 'stage2_unit11_bn1': [0, 0.004502423635617954], 'stage2_unit11_conv2': [0, 0.0020689201636577216], 'stage2_unit11_relu1': [0, 0.0024722176273976725], 'stage2_unit12_conv1': [0, 0.0025123341346350242], 'stage2_unit12_bn1': [0, 0.0036595597511201393], 'stage2_unit12_conv2': [0, 0.001479517404488691], 'stage2_unit12_relu1': [0, 0.002751248793339166], 'stage2_unit13_conv1': [0, 0.00327312054596548], 'stage2_unit13_bn1': [0, 0.005078284289893203], 'stage2_unit13_conv2': [0, 0.004977337488039272], 'stage2_unit13_relu1': [0, 0.003542662840189896], 'stage3_unit1_conv1': [0, 0.0065192087428776295], 'stage3_unit1_bn1': [0, 0.006331641373671885], 'stage3_unit1_conv2': [0, 0.0038254054512564593], 'stage3_unit1_relu1': [0, 
0.0032426635111410785], 'stage3_unit1_conv1sc': [0, 0.015331033646591066], '_plus15': [0, 0.03575846153920091], 'stage3_unit2_conv1': [0, 0.0017656985464997179], 'stage3_unit2_bn1': [0, 0.0032089369034203957], 'stage3_unit2_conv2': [0, 0.0020371811596427377], 'stage3_unit2_relu1': [0, 0.0031469558167645313], 'stage3_unit3_conv1': [0, 0.001450334242948397], 'stage3_unit3_bn1': [0, 0.0031439556380895178], 'stage3_unit3_conv2': [0, 0.0012146797705823043], 'stage3_unit3_relu1': [0, 0.0024106570585506167], 'stage3_unit4_conv1': [0, 0.002201667451483058], 'stage3_unit4_bn1': [0, 0.004127682193996399], 'stage3_unit4_conv2': [0, 0.0013652135302701335], 'stage3_unit4_relu1': [0, 0.0031128650105844333], 'stage3_unit5_conv1': [0, 0.0028511684710585228], 'stage3_unit5_bn1': [0, 0.003998747960788997], 'stage3_unit5_conv2': [0, 0.0020725731774577944], 'stage3_unit5_relu1': [0, 0.0027312059102095956], 'stage3_unit6_conv1': [0, 0.004092128258051835], 'stage3_unit6_bn1': [0, 0.004856141533438615], 'stage3_unit6_conv2': [0, 0.002332683388642439], 'stage3_unit6_relu1': [0, 0.0038670808311522475], 'stage3_unit7_conv1': [0, 0.0023217304485050713], 'stage3_unit7_bn1': [0, 0.005844213831143116], 'stage3_unit7_conv2': [0, 0.0012914266173295148], 'stage3_unit7_relu1': [0, 0.0028547083768318956], 'stage3_unit8_conv1': [0, 0.0023769933407700905], 'stage3_unit8_bn1': [0, 0.0043068068234000615], 'stage3_unit8_conv2': [0, 0.002196611851219117], 'stage3_unit8_relu1': [0, 0.00395574672954289], 'stage3_unit9_conv1': [0, 0.0042939035911259684], 'stage3_unit9_bn1': [0, 0.00570553916645801], 'stage3_unit9_conv2': [0, 0.001235060569808239], 'stage3_unit9_relu1': [0, 0.0025175781700554796], 'stage3_unit10_conv1': [0, 0.004020637883914737], 'stage3_unit10_bn1': [0, 0.0049915984859616735], 'stage3_unit10_conv2': [0, 0.0017236303391419058], 'stage3_unit10_relu1': [0, 0.0026307267936195914], 'stage3_unit11_conv1': [0, 0.003428289974768331], 'stage3_unit11_bn1': [0, 0.004625562607772707], 'stage3_unit11_conv2': [0, 0.0025534979471071497], 'stage3_unit11_relu1': [0, 0.0026015184057040478], 'stage3_unit12_conv1': [0, 0.0032589811039721874], 'stage3_unit12_bn1': [0, 0.0049844248088326045], 'stage3_unit12_conv2': [0, 0.003412988007537962], 'stage3_unit12_relu1': [0, 0.0038257264715480053], 'stage3_unit13_conv1': [0, 0.0029506615297062192], 'stage3_unit13_bn1': [0, 0.006582278435624491], 'stage3_unit13_conv2': [0, 0.0013310282718478226], 'stage3_unit13_relu1': [0, 0.0030190653688325656], 'stage3_unit14_conv1': [0, 0.0020184263469666007], 'stage3_unit14_bn1': [0, 0.006445717154525396], 'stage3_unit14_conv2': [0, 0.0008642697545487111], 'stage3_unit14_relu1': [0, 0.0025302279652572993], 'stage3_unit15_conv1': [0, 0.0026751275606981414], 'stage3_unit15_bn1': [0, 0.004429226315866305], 'stage3_unit15_conv2': [0, 0.0016296920109921554], 'stage3_unit15_relu1': [0, 0.0026190998986011416], 'stage3_unit16_conv1': [0, 0.0012866996639356839], 'stage3_unit16_bn1': [0, 0.00609699478299599], 'stage3_unit16_conv2': [0, 0.001235720562183951], 'stage3_unit16_relu1': [0, 0.0024308613904817835], 'stage3_unit17_conv1': [0, 0.0030087166883814055], 'stage3_unit17_bn1': [0, 0.005449368728427436], 'stage3_unit17_conv2': [0, 0.0020825581287774514], 'stage3_unit17_relu1': [0, 0.003656094703148669], 'stage3_unit18_conv1': [0, 0.0030028270924185203], 'stage3_unit18_bn1': [0, 0.005156463994754581], 'stage3_unit18_conv2': [0, 0.001348726510062931], 'stage3_unit18_relu1': [0, 0.00352265891127699], 'stage3_unit19_conv1': [0, 0.0022132605548918715], 
'stage3_unit19_bn1': [0, 0.008421568419989638], 'stage3_unit19_conv2': [0, 0.0016626281531776969], 'stage3_unit19_relu1': [0, 0.004618875154360073], 'stage3_unit20_conv1': [0, 0.002569252111780362], 'stage3_unit20_bn1': [0, 0.005869278757590947], 'stage3_unit20_conv2': [0, 0.0012604762015380258], 'stage3_unit20_relu1': [0, 0.0036138432232413705], 'stage3_unit21_conv1': [0, 0.0016877652387919388], 'stage3_unit21_bn1': [0, 0.006444983125671627], 'stage3_unit21_conv2': [0, 0.0015283700988048643], 'stage3_unit21_relu1': [0, 0.0023598905623428467], 'stage3_unit22_conv1': [0, 0.0011096423066507175], 'stage3_unit22_bn1': [0, 0.005307595091541921], 'stage3_unit22_conv2': [0, 0.0016137707655824076], 'stage3_unit22_relu1': [0, 0.001991969863260825], 'stage3_unit23_conv1': [0, 0.0011686222759757455], 'stage3_unit23_bn1': [0, 0.005606956369294895], 'stage3_unit23_conv2': [0, 0.0007499285100951908], 'stage3_unit23_relu1': [0, 0.00162859495699875], 'stage3_unit24_conv1': [0, 0.0017282438325131033], 'stage3_unit24_bn1': [0, 0.006576113344177487], 'stage3_unit24_conv2': [0, 0.0012313588632373359], 'stage3_unit24_relu1': [0, 0.002739064571425671], 'stage3_unit25_conv1': [0, 0.0013670245493490865], 'stage3_unit25_bn1': [0, 0.004078700786500465], 'stage3_unit25_conv2': [0, 0.0007711785046134408], 'stage3_unit25_relu1': [0, 0.0016410070376133355], 'stage3_unit26_conv1': [0, 0.0016500723408901785], 'stage3_unit26_bn1': [0, 0.004552193983333317], 'stage3_unit26_conv2': [0, 0.0017180087294165543], 'stage3_unit26_relu1': [0, 0.002048210130901787], 'stage3_unit27_conv1': [0, 0.0018142065194648083], 'stage3_unit27_bn1': [0, 0.0037283329513129286], 'stage3_unit27_conv2': [0, 0.0005505358961623484], 'stage3_unit27_relu1': [0, 0.0017541655170635914], 'stage3_unit28_conv1': [0, 0.0027773504651437595], 'stage3_unit28_bn1': [0, 0.005445079071315255], 'stage3_unit28_conv2': [0, 0.002022677284526074], 'stage3_unit28_relu1': [0, 0.0028654172664552223], 'stage3_unit29_conv1': [0, 0.0023160529887582375], 'stage3_unit29_bn1': [0, 0.005417428617402325], 'stage3_unit29_conv2': [0, 0.002053796775697723], 'stage3_unit29_relu1': [0, 0.001805328947352612], 'stage3_unit30_conv1': [0, 0.0019281831547969907], 'stage3_unit30_bn1': [0, 0.004213685125816526], 'stage3_unit30_conv2': [0, 0.0007699149565433892], 'stage3_unit30_relu1': [0, 0.0020882252633102295], 'stage4_unit1_conv1': [0, 0.0039138068833689055], 'stage4_unit1_bn1': [0, 0.0057999303960424705], 'stage4_unit1_conv2': [0, 0.0008152449928869413], 'stage4_unit1_relu1': [0, 0.002968962736955778], 'stage4_unit1_conv1sc': [0, 0.011939589432844027], '_plus45': [0, 0.03649481825941191], 'stage4_unit2_conv1': [0, 0.0010706278986818208], 'stage4_unit2_bn1': [0, 0.0020408686690443142], 'stage4_unit2_conv2': [0, 0.00030984175134831526], 'stage4_unit2_relu1': [0, 0.0011117474769982766], 'stage4_unit3_conv1': [0, 1], 'stage4_unit3_bn1': [0, 5.607301321918843e-41], 'stage4_unit3_conv2': [0, 1], 'stage4_unit3_relu1': [0, 1.555687350170267e-41]}
Warning: The original model opset version is 9, which does not support quantized operators.
The opset version of quantized model will be set to 10. Use onnx model checker to verify model after quantization.
Traceback (most recent call last):
  File "calibrate.py", line 379, in <module>
    main()
  File "calibrate.py", line 372, in main
    symmetric_weight=args.mode == 'int8')
  File "/home/zenk/onnxruntime-riscv/systolic_runner/quantization/quantize.py", line 1418, in quantize
    quantizer.quantize_model()
  File "/home/zenk/onnxruntime-riscv/systolic_runner/quantization/quantize.py", line 312, in quantize_model
    new_list += self._quantize_convolution(node, new_list)
  File "/home/zenk/onnxruntime-riscv/systolic_runner/quantization/quantize.py", line 1296, in _quantize_convolution
    return self._quantize_convolution_qlinear_ops(node, new_nodes_list)
  File "/home/zenk/onnxruntime-riscv/systolic_runner/quantization/quantize.py", line 1188, in _quantize_convolution_qlinear_ops
    self._get_quantization_params(node.output[0])
  File "/home/zenk/onnxruntime-riscv/systolic_runner/quantization/quantize.py", line 644, in _get_quantization_params
    scale_values = [params[1].item()]
AttributeError: 'int' object has no attribute 'item'
Hm that's very weird. Looks like it's a bug in the quantization script then. I can reproduce this on my end so I'll take a look and see if I can figure out what's going on.
If you'd also like to try debugging this, maybe you could surround `zero_point_values = [params[0].item()]` with a try-except and set a pdb breakpoint in the except. That way you can print out the parameters and take a look. It seems weird that `params[1]` exists, but the type isn't what's expected.
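Roughly what that debugging change might look like inside `_get_quantization_params` (the names come from the traceback above; the exact placement is an assumption):

```python
try:
    zero_point_values = [params[0].item()]
    scale_values = [params[1].item()]
except AttributeError:
    # Drop into the debugger so params[0] / params[1] can be inspected interactively.
    import pdb
    pdb.set_trace()
```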
Ok fixed it! It's a 1 line change :)
https://github.com/pranav-prakash/onnxruntime-riscv/commit/e12f338c469ef09be1c7b8e950aed5202c6fed22
The issue was that the calibration script was returning a non-numpy type in the edge case where `rmin == rmax` while calculating the scale for the quantization parameters. After this fix it successfully saves the quantized model.
Please keep me posted on your results. I'd be interested to see how well the quantized model performs in terms of accuracy. There's a good chance the accuracy might be off at first for a few reasons:

- `input_0.pb` has already been preprocessed, so you might want to try `--data_preprocess=None` when running the quantization script.
- You can also try quantizing with `uint8` instead of `int8`, which uses the native `uint8` support in onnxruntime and doesn't suffer from the power-of-2 scale issue. That will quickly tell you whether the issue is with the quantization step or with the inference step.

Sorry for the late reply. The network can now be quantized, but I actually have not run the network successfully. There are some errors that seem to be caused by the neural network itself. The first error is when running the network with the unmodified runner, and this is the message:
Gemmini extension configured with:
dim = 16
bbl loader
Loaded runner program
Using systolic in mode 1
Using Onnxruntime C++ API
Number of inputs = 1
Input 0 : name=data, type=1, num_dims=4: [1, 3, 112, 112, ]
Number of outputs = 1
Output 0 : name=fc1, type=1, num_dims=2: [1, 512, ]
Loading image
Image dimensions: 224 224 3
First few image values 1.187174 1.426920 1.255673
Called into systolic matmul!
Using accelerated matmul with dimensions (64, 12544, 27)
Called into systolic matmul!
Using accelerated matmul with dimensions (64, 3136, 64)
Called into systolic matmul!
Using accelerated matmul with dimensions (64, 12544, 576)
Called into systolic matmul!
Using accelerated matmul with dimensions (64, 3136, 576)
Called into systolic matmul!
Using accelerated matmul with dimensions (64, 3136, 576)
Called into systolic matmul!
......
Using accelerated matmul with dimensions (512, 49, 4608)
Called into systolic matmul!
Using accelerated matmul with dimensions (512, 49, 4608)
Called into systolic matmul!
Using accelerated matmul with dimensions (512, 49, 4608)
Called into systolic matmul!
1970-01-01 08:00:01.975564019 [E:onnxruntime:, sequential_executor.cc:277 Execute] Non-zero status code returned while running QLinearConv node. Name:'stage4_unit3_conv1_quant' Status Message: Divisor passed to systolic matmul must be power of 2
terminate called after throwing an instance of 'Ort::Exception'
what(): Non-zero status code returned while running QLinearConv node. Name:'stage4_unit3_conv1_quant' Status Message: Divisor passed to systolic matmul must be power of 2
bad syscall #131!
And I tried to use the `arcface_validation` that they provide to measure the accuracy, but the quantized network cannot be loaded by the validation program.
I tried to measure the original network, too, and it still failed. It gives a message like this: `Cannot broadcast gamma to data. gamma: [1,64,1,1], data: [1,64,112,112]`. I found an issue about this, and it seems there is no solution yet.
I also noticed the mlperf benchmark mentioned in the readme document, and it seems it does not support this network.
I do not know if the runner's fixed input size could cause the first error (I still used a 224x224 picture to run). If not, this network may not be suitable to continue with.
Thanks a lot for your help.
> The first error is when running the network with the unmodified runner
>
> Status Message: Divisor passed to systolic matmul must be power of 2
That's an error that shouldn't happen, since we make sure to round to the nearest power of 2. Can you add a `print` within this function https://github.com/pranav-prakash/onnxruntime-riscv/blob/4c7a0ad5c94fb90bbcc5c876c73e940d3c67d37d/onnxruntime/core/providers/systolic/helper/helper.h to see what the input and output are?
Edit: Ok I can reproduce this as well. I'll investigate and see what's happening.
Also, this is unrelated to the error you got, but it seems that the network takes as input a 112x112 image that is the result of the preprocessing described here. You should be able to feed your 224x224 image through the preprocess script and then change the runner script to load 112x112 images. I'm not sure why the arcface documentation says "There are no constraints on the size of the image" when the model seems to expect a 112x112 image and the training data also used 112x112 images.
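For the runner-side change, here's a rough Python illustration (not the C++ runner, and deliberately skipping the face-detection/alignment step that the arcface preprocessing pipeline normally performs) of producing a 1x3x112x112 float tensor:

```python
import numpy as np
from PIL import Image

# Illustrative only: real arcface preprocessing aligns and crops the face first.
img = Image.open("face.png").convert("RGB").resize((112, 112))
tensor = np.asarray(img, dtype=np.float32).transpose(2, 0, 1)[None, ...]  # NCHW
print(tensor.shape)  # (1, 3, 112, 112)
```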
As for the mlperf benchmark, I have not tried that (it comes from Microsoft), and it will likely not work with the gemmini backend anyway.
Finally, with regard to

> I tried to measure the origin network, too. And it still failed.

Maybe it's better to export a clean model from PyTorch? It seems the original model might be broken, as described in the issue you linked. Alternatively, maybe try upgrading the opset version using `convert_to_opset9.py` and see if that helps?
Ok, found the issue. It's again an issue with the calibration/quantization script, and in the same line as before. In this case `rmin` and `rmax` were both on the order of 1e-41, so the scale was essentially 0 and it overflowed an int when dividing by it. The fix is to not compare the floats directly but instead use `np.isclose`, which has a default tolerance of 1e-8, which should be fine. Thanks for discovering these bugs!
Fixed in commit https://github.com/pranav-prakash/onnxruntime-riscv/commit/6475932331f1a0f223221eb40c79959b0f58741f (and a further correction to that fix in https://github.com/pranav-prakash/onnxruntime-riscv/commit/c0577996c8ac73075eea6f250e67bbf86631848e)
After this I can successfully run the model using the runner script (I have not tried post-processing the output so I don't know if it's accurate though).
I should probably also update the assertions in the cpp file, since `ORT_ENFORCE(X_scale_value != 0, "X_scale_value cannot be 0");` suffers from the same issue, but this shouldn't matter for now.
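To make the shape of the fix concrete, here is a minimal sketch (assumed function and variable names, not the actual calibrate.py/quantize.py code) of guarding the scale computation with `np.isclose`:

```python
import numpy as np

def compute_scale(rmin, rmax, num_bits=8):
    """Return a float32 scale, treating a numerically degenerate range as unit scale."""
    # Values like rmin ~ -7e-39 and rmax ~ 6e-39 compare unequal yet yield a ~0 scale,
    # which blows up the integer division downstream; np.isclose (default atol 1e-8)
    # catches that case.
    if np.isclose(rmin, rmax):
        return np.float32(1.0)
    return np.float32((rmax - rmin) / (2 ** num_bits - 1))
```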
Good, now I can run the quantized network. But I cannot run the original network to get a comparison, because of the `Cannot broadcast gamma to data. gamma: [1,64,1,1], data: [1,64,112,112]` problem. So I will try to build a new network and quantize it. If there are any problems later, I will open a new issue. Thank you very much for all the help these days.
**Describe the bug**
When running the onnx models from https://github.com/pranav-prakash/onnxruntime-riscv/releases/tag/v0.01, I got `bad syscall #131!`. I have tried googlenet_quantized.onnx, mobilenet_quantized_optimized.onnx and resnet50_quantized.onnx; only the resnet could run normally.

**System information**

**To Reproduce**
https://github.com/pranav-prakash/onnxruntime-riscv/releases/tag/v0.01

`spike --extension=gemmini pk ort_test -m googlenet_quantized.onnx -i images/cat.jpg -p caffe2 -x 1 -O 0`

Here is the result: `bad syscall #131!`
Traceback (most recent call last):
  File "calibrate.py", line 379, in <module>
    main()
  File "calibrate.py", line 348, in main
    args.data_preprocess)
  File "calibrate.py", line 289, in load_test_data
    preprocess_method)
  File "calibrate.py", line 261, in load_single_test_data
    'Number of input protobufs does not match expected model inputs')
ValueError: Number of input protobufs does not match expected model inputs