zetane / viewer

ML models and internal tensors 3D visualizer

BUG: Viewer crashes when loading any model #3

Closed (paulgavrikov closed 3 years ago)

paulgavrikov commented 3 years ago

I've tried loading multiple models including emotion-ferplus (both onnx and ztn formats) but they always immediately crash the viewer.

OS: Ubuntu 20.04
Zetane: 1.3.2
Dump:

LoadUniverse(): ZTN_REQUIRE_LOGIN = 1 
online = 0 
================== ExposeIRnodes: ================== 
@@@ ExposeIRnodes() n_IR_outputs = 51. 
 <- [Parameter1367_reshape1. 
 <- [Minus340_Output_0. 
 <- [Block352_Output_0. 
 <- [Convolution362_Output_0. 
 <- [Plus364_Output_0. 
 <- [ReLU366_Output_0. 
 <- [Convolution380_Output_0. 
 <- [Plus382_Output_0. 
 <- [ReLU384_Output_0. 
 <- [Pooling398_Output_0. 
 <- [Dropout408_Output_0. 
 <- [Convolution418_Output_0. 
 <- [Plus420_Output_0. 
 <- [ReLU422_Output_0. 
 <- [Convolution436_Output_0. 
 <- [Plus438_Output_0. 
 <- [ReLU440_Output_0. 
 <- [Pooling454_Output_0. 
 <- [Dropout464_Output_0. 
 <- [Convolution474_Output_0. 
 <- [Plus476_Output_0. 
 <- [ReLU478_Output_0. 
 <- [Convolution492_Output_0. 
 <- [Plus494_Output_0. 
 <- [ReLU496_Output_0. 
 <- [Convolution510_Output_0. 
 <- [Plus512_Output_0. 
 <- [ReLU514_Output_0. 
 <- [Pooling528_Output_0. 
 <- [Dropout538_Output_0. 
 <- [Convolution548_Output_0. 
 <- [Plus550_Output_0. 
 <- [ReLU552_Output_0. 
 <- [Convolution566_Output_0. 
 <- [Plus568_Output_0. 
 <- [ReLU570_Output_0. 
 <- [Convolution584_Output_0. 
 <- [Plus586_Output_0. 
 <- [ReLU588_Output_0. 
 <- [Pooling602_Output_0. 
 <- [Dropout612_Output_0. 
 <- [Dropout612_Output_0_reshape0. 
 <- [Times622_Output_0. 
 <- [Plus624_Output_0. 
 <- [ReLU636_Output_0. 
 <- [Dropout646_Output_0. 
 <- [Times656_Output_0. 
 <- [Plus658_Output_0. 
 <- [ReLU670_Output_0. 
 <- [Dropout680_Output_0. 
 <- [Times690_Output_0. 
node_name = Node_0000000000_Times622_reshape1_Reshape. 
 -> [Parameter1367_reshape1. 
node_name = Node_0000000001_Minus340_Sub. 
 -> [Minus340_Output_0. 
node_name = Node_0000000002_Block352_Div. 
 -> [Block352_Output_0. 
node_name = Node_0000000003_Convolution362_Conv. 
 -> [Convolution362_Output_0. 
node_name = Node_0000000004_Plus364_Add. 
 -> [Plus364_Output_0. 
node_name = Node_0000000005_ReLU366_Relu. 
 -> [ReLU366_Output_0. 
node_name = Node_0000000006_Convolution380_Conv. 
 -> [Convolution380_Output_0. 
node_name = Node_0000000007_Plus382_Add. 
 -> [Plus382_Output_0. 
node_name = Node_0000000008_ReLU384_Relu. 
 -> [ReLU384_Output_0. 
node_name = Node_0000000009_Pooling398_MaxPool. 
 -> [Pooling398_Output_0. 
node_name = Node_0000000010_Dropout408_Dropout. 
 -> [Dropout408_Output_0. 
node_name = Node_0000000011_Convolution418_Conv. 
 -> [Convolution418_Output_0. 
node_name = Node_0000000012_Plus420_Add. 
 -> [Plus420_Output_0. 
node_name = Node_0000000013_ReLU422_Relu. 
 -> [ReLU422_Output_0. 
node_name = Node_0000000014_Convolution436_Conv. 
 -> [Convolution436_Output_0. 
node_name = Node_0000000015_Plus438_Add. 
 -> [Plus438_Output_0. 
node_name = Node_0000000016_ReLU440_Relu. 
 -> [ReLU440_Output_0. 
node_name = Node_0000000017_Pooling454_MaxPool. 
 -> [Pooling454_Output_0. 
node_name = Node_0000000018_Dropout464_Dropout. 
 -> [Dropout464_Output_0. 
node_name = Node_0000000019_Convolution474_Conv. 
 -> [Convolution474_Output_0. 
node_name = Node_0000000020_Plus476_Add. 
 -> [Plus476_Output_0. 
node_name = Node_0000000021_ReLU478_Relu. 
 -> [ReLU478_Output_0. 
node_name = Node_0000000022_Convolution492_Conv. 
 -> [Convolution492_Output_0. 
node_name = Node_0000000023_Plus494_Add. 
 -> [Plus494_Output_0. 
node_name = Node_0000000024_ReLU496_Relu. 
 -> [ReLU496_Output_0. 
node_name = Node_0000000025_Convolution510_Conv. 
 -> [Convolution510_Output_0. 
node_name = Node_0000000026_Plus512_Add. 
 -> [Plus512_Output_0. 
node_name = Node_0000000027_ReLU514_Relu. 
 -> [ReLU514_Output_0. 
node_name = Node_0000000028_Pooling528_MaxPool. 
 -> [Pooling528_Output_0. 
node_name = Node_0000000029_Dropout538_Dropout. 
 -> [Dropout538_Output_0. 
node_name = Node_0000000030_Convolution548_Conv. 
 -> [Convolution548_Output_0. 
node_name = Node_0000000031_Plus550_Add. 
 -> [Plus550_Output_0. 
node_name = Node_0000000032_ReLU552_Relu. 
 -> [ReLU552_Output_0. 
node_name = Node_0000000033_Convolution566_Conv. 
 -> [Convolution566_Output_0. 
node_name = Node_0000000034_Plus568_Add. 
 -> [Plus568_Output_0. 
node_name = Node_0000000035_ReLU570_Relu. 
 -> [ReLU570_Output_0. 
node_name = Node_0000000036_Convolution584_Conv. 
 -> [Convolution584_Output_0. 
node_name = Node_0000000037_Plus586_Add. 
 -> [Plus586_Output_0. 
node_name = Node_0000000038_ReLU588_Relu. 
 -> [ReLU588_Output_0. 
node_name = Node_0000000039_Pooling602_MaxPool. 
 -> [Pooling602_Output_0. 
node_name = Node_0000000040_Dropout612_Dropout. 
 -> [Dropout612_Output_0. 
node_name = Node_0000000041_Times622_reshape0_Reshape. 
 -> [Dropout612_Output_0_reshape0. 
node_name = Node_0000000042_Times622_MatMul. 
 -> [Times622_Output_0. 
node_name = Node_0000000043_Plus624_Add. 
 -> [Plus624_Output_0. 
node_name = Node_0000000044_ReLU636_Relu. 
 -> [ReLU636_Output_0. 
node_name = Node_0000000045_Dropout646_Dropout. 
 -> [Dropout646_Output_0. 
node_name = Node_0000000046_Times656_MatMul. 
 -> [Times656_Output_0. 
node_name = Node_0000000047_Plus658_Add. 
 -> [Plus658_Output_0. 
node_name = Node_0000000048_ReLU670_Relu. 
 -> [ReLU670_Output_0. 
node_name = Node_0000000049_Dropout680_Dropout. 
 -> [Dropout680_Output_0. 
node_name = Node_0000000050_Times690_MatMul. 
 -> [Times690_Output_0. 
node_name = Node_0000000051_Plus692_Add. 
@@@ ExposeIRnodes() [Outputs] = 1 --> 52. 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
***************** ValidateIRnodes: ***************** 
====================================  
input_dims = [ 1, 1, 64, 64, ]. 
--> input 0[Input3]: Type 1; [4 dims] tensor  
--------------- 
input_dims = [ ]. 
--> input 1[Constant339]: Type 1; [0 dims] tensor  
--------------- 
input_dims = [ ]. 
--> input 2[Constant343]: Type 1; [0 dims] tensor  
--------------- 
input_dims = [ 64, 1, 3, 3, ]. 
--> input 3[Parameter3]: Type 1; [4 dims] tensor  
--------------- 
input_dims = [ 64, 1, 1, ]. 
--> input 4[Parameter4]: Type 1; [3 dims] tensor  
--------------- 
input_dims = [ 64, 64, 3, 3, ]. 
--> input 5[Parameter23]: Type 1; [4 dims] tensor  
--------------- 
input_dims = [ 64, 1, 1, ]. 
--> input 6[Parameter24]: Type 1; [3 dims] tensor  
--------------- 
input_dims = [ 128, 64, 3, 3, ]. 
--> input 7[Parameter63]: Type 1; [4 dims] tensor  
--------------- 
input_dims = [ 128, 1, 1, ]. 
--> input 8[Parameter64]: Type 1; [3 dims] tensor  
--------------- 
input_dims = [ 128, 128, 3, 3, ]. 
--> input 9[Parameter83]: Type 1; [4 dims] tensor  
--------------- 
input_dims = [ 128, 1, 1, ]. 
--> input 10[Parameter84]: Type 1; [3 dims] tensor  
--------------- 
input_dims = [ 256, 128, 3, 3, ]. 
--> input 11[Parameter575]: Type 1; [4 dims] tensor  
--------------- 
input_dims = [ 256, 1, 1, ]. 
--> input 12[Parameter576]: Type 1; [3 dims] tensor  
--------------- 
input_dims = [ 256, 256, 3, 3, ]. 
--> input 13[Parameter595]: Type 1; [4 dims] tensor  
--------------- 
input_dims = [ 256, 1, 1, ]. 
--> input 14[Parameter596]: Type 1; [3 dims] tensor  
--------------- 
input_dims = [ 256, 256, 3, 3, ]. 
--> input 15[Parameter615]: Type 1; [4 dims] tensor  
--------------- 
input_dims = [ 256, 1, 1, ]. 
--> input 16[Parameter616]: Type 1; [3 dims] tensor  
--------------- 
input_dims = [ 256, 256, 3, 3, ]. 
--> input 17[Parameter655]: Type 1; [4 dims] tensor  
--------------- 
input_dims = [ 256, 1, 1, ]. 
--> input 18[Parameter656]: Type 1; [3 dims] tensor  
--------------- 
input_dims = [ 256, 256, 3, 3, ]. 
--> input 19[Parameter675]: Type 1; [4 dims] tensor  
--------------- 
input_dims = [ 256, 1, 1, ]. 
--> input 20[Parameter676]: Type 1; [3 dims] tensor  
--------------- 
input_dims = [ 256, 256, 3, 3, ]. 
--> input 21[Parameter695]: Type 1; [4 dims] tensor  
--------------- 
input_dims = [ 256, 1, 1, ]. 
--> input 22[Parameter696]: Type 1; [3 dims] tensor  
--------------- 
input_dims = [ 2, ]. 
--> input 23[Dropout612_Output_0_reshape0_shape]: Type 7; [1 dims] tensor  
--------------- 
input_dims = [ 256, 4, 4, 1024, ]. 
--> input 24[Parameter1367]: Type 1; [4 dims] tensor  
--------------- 
input_dims = [ 2, ]. 
--> input 25[Parameter1367_reshape1_shape]: Type 7; [1 dims] tensor  
--------------- 
input_dims = [ 1024, ]. 
--> input 26[Parameter1368]: Type 1; [1 dims] tensor  
--------------- 
input_dims = [ 1024, 1024, ]. 
--> input 27[Parameter1403]: Type 1; [2 dims] tensor  
--------------- 
input_dims = [ 1024, ]. 
--> input 28[Parameter1404]: Type 1; [1 dims] tensor  
--------------- 
input_dims = [ 1024, 8, ]. 
--> input 29[Parameter1693]: Type 1; [2 dims] tensor  
--------------- 
input_dims = [ 8, ]. 
--> input 30[Parameter1694]: Type 1; [1 dims] tensor  
--------------- 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
output_dims = [ 1, 8, ]. 
--> output 0/1[Plus692_Output_0]: Type 1; [2 dims] tensor  
--------------- 
output_dims = [ 4096, 1024, ]. 
--> output 1/1[Parameter1367_reshape1]: Type 1; [2 dims] tensor  
--------------- 
output_dims = [ 1, 1, 64, 64, ]. 
--> output 2/1[Minus340_Output_0]: Type 1; [4 dims] tensor  
--------------- 
output_dims = [ 1, 1, 64, 64, ]. 
--> output 3/1[Block352_Output_0]: Type 1; [4 dims] tensor  
--------------- 
output_dims = [ 1, 64, 64, 64, ]. 
--> output 4/1[Convolution362_Output_0]: Type 1; [4 dims] tensor  
--------------- 
output_dims = [ 1, 64, 64, 64, ]. 
--> output 5/1[Plus364_Output_0]: Type 1; [4 dims] tensor  
--------------- 
output_dims = [ 1, 64, 64, 64, ]. 
--> output 6/1[ReLU366_Output_0]: Type 1; [4 dims] tensor  
--------------- 
output_dims = [ 1, 64, 64, 64, ]. 
--> output 7/1[Convolution380_Output_0]: Type 1; [4 dims] tensor  
--------------- 
output_dims = [ 1, 64, 64, 64, ]. 
--> output 8/1[Plus382_Output_0]: Type 1; [4 dims] tensor  
--------------- 
output_dims = [ 1, 64, 64, 64, ]. 
--> output 9/1[ReLU384_Output_0]: Type 1; [4 dims] tensor  
--------------- 
output_dims = [ 1, 64, 32, 32, ]. 
--> output 10/1[Pooling398_Output_0]: Type 1; [4 dims] tensor  
--------------- 
output_dims = [ 1, 64, 32, 32, ]. 
--> output 11/1[Dropout408_Output_0]: Type 1; [4 dims] tensor  
--------------- 
output_dims = [ 1, 128, 32, 32, ]. 
--> output 12/1[Convolution418_Output_0]: Type 1; [4 dims] tensor  
--------------- 
output_dims = [ 1, 128, 32, 32, ]. 
--> output 13/1[Plus420_Output_0]: Type 1; [4 dims] tensor  
--------------- 
output_dims = [ 1, 128, 32, 32, ]. 
--> output 14/1[ReLU422_Output_0]: Type 1; [4 dims] tensor  
--------------- 
output_dims = [ 1, 128, 32, 32, ]. 
--> output 15/1[Convolution436_Output_0]: Type 1; [4 dims] tensor  
--------------- 
output_dims = [ 1, 128, 32, 32, ]. 
--> output 16/1[Plus438_Output_0]: Type 1; [4 dims] tensor  
--------------- 
output_dims = [ 1, 128, 32, 32, ]. 
--> output 17/1[ReLU440_Output_0]: Type 1; [4 dims] tensor  
--------------- 
output_dims = [ 1, 128, 16, 16, ]. 
--> output 18/1[Pooling454_Output_0]: Type 1; [4 dims] tensor  
--------------- 
output_dims = [ 1, 128, 16, 16, ]. 
--> output 19/1[Dropout464_Output_0]: Type 1; [4 dims] tensor  
--------------- 
output_dims = [ 1, 256, 16, 16, ]. 
--> output 20/1[Convolution474_Output_0]: Type 1; [4 dims] tensor  
--------------- 
output_dims = [ 1, 256, 16, 16, ]. 
--> output 21/1[Plus476_Output_0]: Type 1; [4 dims] tensor  
--------------- 
output_dims = [ 1, 256, 16, 16, ]. 
--> output 22/1[ReLU478_Output_0]: Type 1; [4 dims] tensor  
--------------- 
output_dims = [ 1, 256, 16, 16, ]. 
--> output 23/1[Convolution492_Output_0]: Type 1; [4 dims] tensor  
--------------- 
output_dims = [ 1, 256, 16, 16, ]. 
--> output 24/1[Plus494_Output_0]: Type 1; [4 dims] tensor  
--------------- 
output_dims = [ 1, 256, 16, 16, ]. 
--> output 25/1[ReLU496_Output_0]: Type 1; [4 dims] tensor  
--------------- 
output_dims = [ 1, 256, 16, 16, ]. 
--> output 26/1[Convolution510_Output_0]: Type 1; [4 dims] tensor  
--------------- 
output_dims = [ 1, 256, 16, 16, ]. 
--> output 27/1[Plus512_Output_0]: Type 1; [4 dims] tensor  
--------------- 
output_dims = [ 1, 256, 16, 16, ]. 
--> output 28/1[ReLU514_Output_0]: Type 1; [4 dims] tensor  
--------------- 
output_dims = [ 1, 256, 8, 8, ]. 
--> output 29/1[Pooling528_Output_0]: Type 1; [4 dims] tensor  
--------------- 
output_dims = [ 1, 256, 8, 8, ]. 
--> output 30/1[Dropout538_Output_0]: Type 1; [4 dims] tensor  
--------------- 
output_dims = [ 1, 256, 8, 8, ]. 
--> output 31/1[Convolution548_Output_0]: Type 1; [4 dims] tensor  
--------------- 
output_dims = [ 1, 256, 8, 8, ]. 
--> output 32/1[Plus550_Output_0]: Type 1; [4 dims] tensor  
--------------- 
output_dims = [ 1, 256, 8, 8, ]. 
--> output 33/1[ReLU552_Output_0]: Type 1; [4 dims] tensor  
--------------- 
output_dims = [ 1, 256, 8, 8, ]. 
--> output 34/1[Convolution566_Output_0]: Type 1; [4 dims] tensor  
--------------- 
output_dims = [ 1, 256, 8, 8, ]. 
--> output 35/1[Plus568_Output_0]: Type 1; [4 dims] tensor  
--------------- 
output_dims = [ 1, 256, 8, 8, ]. 
--> output 36/1[ReLU570_Output_0]: Type 1; [4 dims] tensor  
--------------- 
output_dims = [ 1, 256, 8, 8, ]. 
--> output 37/1[Convolution584_Output_0]: Type 1; [4 dims] tensor  
--------------- 
output_dims = [ 1, 256, 8, 8, ]. 
--> output 38/1[Plus586_Output_0]: Type 1; [4 dims] tensor  
--------------- 
output_dims = [ 1, 256, 8, 8, ]. 
--> output 39/1[ReLU588_Output_0]: Type 1; [4 dims] tensor  
--------------- 
output_dims = [ 1, 256, 4, 4, ]. 
--> output 40/1[Pooling602_Output_0]: Type 1; [4 dims] tensor  
--------------- 
output_dims = [ 1, 256, 4, 4, ]. 
--> output 41/1[Dropout612_Output_0]: Type 1; [4 dims] tensor  
--------------- 
output_dims = [ 1, 4096, ]. 
--> output 42/1[Dropout612_Output_0_reshape0]: Type 1; [2 dims] tensor  
--------------- 
output_dims = [ 1, 1024, ]. 
--> output 43/1[Times622_Output_0]: Type 1; [2 dims] tensor  
--------------- 
output_dims = [ 1, 1024, ]. 
--> output 44/1[Plus624_Output_0]: Type 1; [2 dims] tensor  
--------------- 
output_dims = [ 1, 1024, ]. 
--> output 45/1[ReLU636_Output_0]: Type 1; [2 dims] tensor  
--------------- 
output_dims = [ 1, 1024, ]. 
--> output 46/1[Dropout646_Output_0]: Type 1; [2 dims] tensor  
--------------- 
output_dims = [ 1, 1024, ]. 
--> output 47/1[Times656_Output_0]: Type 1; [2 dims] tensor  
--------------- 
output_dims = [ 1, 1024, ]. 
--> output 48/1[Plus658_Output_0]: Type 1; [2 dims] tensor  
--------------- 
output_dims = [ 1, 1024, ]. 
--> output 49/1[ReLU670_Output_0]: Type 1; [2 dims] tensor  
--------------- 
output_dims = [ 1, 1024, ]. 
--> output 50/1[Dropout680_Output_0]: Type 1; [2 dims] tensor  
--------------- 
output_dims = [ 1, 8, ]. 
--> output 51/1[Times690_Output_0]: Type 1; [2 dims] tensor  
--------------- 
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< 
ValidateIRnodes() 52 --> 52=52=52 valid output tensors  
--------------- 
----------------- ValidateIRnodes. ----------------- 
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 
*** ExposeIRnodes: 1 --> 52 Outputs. 
*** type: [FLOAT] ~?= STRING 
TVZ10()  input_dims = [ 1, 1, 64, 64, ]. 
--> input [0]: Type FLOAT; [4 dims] tensor  
--------------- 
Warning: Could not load "/opt/zetane/lib/graphviz/libgvplugin_pango.so.6" - file not found
terminate called after throwing an instance of 'std::invalid_argument'
  what():  stod
/usr/bin/zetane: line 26: 37102 Aborted                 (core dumped) ./Zetane --server
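
The final lines of the dump (`std::invalid_argument`, `what():  stod`) are what an uncaught exception from `std::stod` looks like with libstdc++: the function throws `std::invalid_argument` when the text it is given has no leading numeric part. The thread does not say which string the viewer tried to parse, so the following is only a minimal sketch of the failure mode and of one defensive pattern, assuming C++17. The names `parse_double` and `stod_throws_invalid_argument` are hypothetical and are not taken from the Zetane source.

```cpp
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical helper (not from the Zetane code base): parse a double
// without letting std::stod's exceptions escape to the caller.
std::optional<double> parse_double(const std::string& text) {
    try {
        std::size_t consumed = 0;
        const double value = std::stod(text, &consumed);
        // Reject trailing characters such as "1.5abc".
        if (consumed != text.size()) {
            return std::nullopt;
        }
        return value;
    } catch (const std::invalid_argument&) {
        return std::nullopt;  // no numeric prefix at all, e.g. "" or "STRING"
    } catch (const std::out_of_range&) {
        return std::nullopt;  // value does not fit in a double, e.g. "1e999"
    }
}

// Hypothetical probe: true when std::stod rejects the text with
// std::invalid_argument, the exception type reported in the dump.
bool stod_throws_invalid_argument(const std::string& text) {
    try {
        (void)std::stod(text);
    } catch (const std::invalid_argument&) {
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
    return false;
}
```

With this sketch, `stod_throws_invalid_argument("STRING")` and `stod_throws_invalid_argument("")` both return true, while `parse_double("1.5")` yields a value in the default "C" locale. Note that `std::stod` follows the current C locale, so the same text can parse differently on machines with different locale settings; whether that played any role in this crash is not stated in the thread.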

jmagoon commented 3 years ago

Thanks for the report, looking into this now.

jmagoon commented 3 years ago

Hi Paul,

We tested internally with Ubuntu 18.04 and 20.04 and were unable to replicate the crash. Do you mind trying our Dockerfile to see if it's a dependency issue? The guide is here and covers both Nvidia and non-Nvidia machines.

konafah commented 3 years ago

Hi @paulgavrikov, thank you for the bug report. We've finally been able to reproduce this issue, and the fix will be part of our upcoming release (we'll post here when it's ready). Cheers.

jmagoon commented 3 years ago

Fixed in release 1.4.0