deephealthproject / eddl

European Distributed Deep Learning (EDDL) library. A general-purpose library initially developed to cover deep learning needs in healthcare use cases within the DeepHealth project.
https://deephealthproject.github.io/eddl/

Problem in deserialization of an ONNX model #328

Closed · thistlillo closed this issue 2 years ago

thistlillo commented 2 years ago

I originally posted this bug report in the PyEDDL repository. I am filing a new issue here since the bug has also been reproduced in C++.

The C++ code is available here

I am unable to read a model containing an LSTM layer from an ONNX file that was saved after training the model.

Settings:

Python 3.8.6 | packaged by conda-forge | (default, Oct  7 2020, 19:08:05) 
[GCC 7.5.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyeddl 
>>> import pyecvl
>>> print(pyeddl.__version__)
1.2.0
>>> print(pyecvl.__version__)
1.0.0

You can run the code below to reproduce the issue:

import pyeddl.eddl as eddl
from pyeddl.tensor import Tensor
import numpy as np

bs = 20
visual_dim = 3
semantic_dim = 3
vs = 3
emb_size = 200
n_tokens = 3

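# synthetic batch: random visual/semantic feature vectors and random target tokens (one-hot)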
visual_x = Tensor.fromarray(np.random.randn(bs, visual_dim))
semantic_x = Tensor.fromarray(np.random.randn(bs, semantic_dim))

text = Tensor.fromarray(np.random.randint(0, vs + 1, size=(bs, n_tokens, vs)))
text = Tensor.onehot(text, vs)

print(type(visual_x))
print(type(semantic_x))
print(type(text))

#print(visual_x.getdata())
#print(semantic_x.getdata())
#print(text.getdata())

# INPUT: visual features
cnn_top_in = eddl.Input([visual_dim], name="in_visual_features")
visual_features = eddl.RandomUniform(eddl.Dense(cnn_top_in, cnn_top_in.output.shape[1], name="visual_features") )
alpha_v = eddl.Softmax(eddl.Dense(eddl.Tanh(visual_features), visual_features.output.shape[1]), name="alpha_v")  # missing sentence component
v_att = eddl.Mult(alpha_v, visual_features)
print(f"layer visual features: {visual_features.output.shape}")

# INPUT: semantic features
cnn_out_in = eddl.Input([semantic_dim], name="in_semantic_features")
semantic_features = eddl.RandomUniform(eddl.Embedding(eddl.ReduceArgMax(cnn_out_in, [0]), cnn_out_in.output.shape[1], 1, emb_size, name="semantic_features"), -0.05, 0.05)
alpha_s = eddl.Softmax(eddl.Dense(eddl.Tanh(semantic_features), emb_size), name="alpha_s")  # missing sentence component cnn_out.output.shape[1]
s_att = eddl.Mult(alpha_s, semantic_features)
print(f"layer semantic features: {semantic_features.output.shape}")

# co-attention (not exactly, just a reference to the full model)
features = eddl.Concat([v_att, s_att], name="co_att_in")
context = eddl.Dense(features, emb_size, name="co_attention")
print(f"layer coattention: {context.output.shape}")

# lstm
word_in = eddl.Input([vs])
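# argmax collapses the one-hot word vector to a token index for the embedding lookup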
to_lstm = eddl.ReduceArgMax(word_in, [0])
to_lstm = eddl.RandomUniform(eddl.Embedding(to_lstm, vs, 1, emb_size, mask_zeros=True, name="word_embeddings"), -0.05, +0.05)
to_lstm = eddl.Concat([to_lstm, context])
lstm = eddl.LSTM(to_lstm, emb_size, mask_zeros=True, bidirectional=False, name="lstm")
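# mark the word input as the decoder input (the build log below reports "Vec2Seq 1 to 3")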
eddl.setDecoder(word_in)

out_lstm = eddl.Softmax(eddl.Dense(lstm, vs, name="top_dense"), name="rnn_out")
print(f"layer lstm: {out_lstm.output.shape}")

# model
rnn = eddl.Model([cnn_top_in, cnn_out_in], [out_lstm])
eddl.build(rnn, eddl.adam(0.001), ["softmax_cross_entropy"], ["accuracy"], eddl.CS_GPU(g=[1], mem="full_mem"))
eddl.summary(rnn)

eddl.fit(rnn, [visual_x, semantic_x], [text], bs, 1)

eddl.save_net_to_onnx_file(rnn, "rnn.onnx")
# XXX: the next line fails with "RuntimeError: LDense" (see output below)
rnn = eddl.import_net_from_onnx_file("rnn.onnx")

The output is:

<class 'pyeddl._core.Tensor'>
<class 'pyeddl._core.Tensor'>
<class 'pyeddl._core.Tensor'>
layer visual features: [1, 3]
layer semantic features: [1, 200]
layer coattention: [1, 200]
layer lstm: [1, 3]
Generating Random Table
CS with full memory setup
Building model
Selecting GPU device 0
EDDL is running on GPU device 0, Tesla T4
CuBlas initialized on GPU device 0, Tesla T4
CuRand initialized on GPU device 0, Tesla T4
CuDNN initialized on GPU device 0, Tesla T4
-------------------------------------------------------------------------------
model
-------------------------------------------------------------------------------
in_visual_features  |  (3)                 =>   (3)                 0         
visual_features     |  (3)                 =>   (3)                 12        
tanh1               |  (3)                 =>   (3)                 0         
dense1              |  (3)                 =>   (3)                 12        
alpha_v             |  (3)                 =>   (3)                 0         
mult_1              |  (3)                 =>   (3)                 0         
in_semantic_features|  (3)                 =>   (3)                 0         
reduction_argmax1   |  (3)                 =>   (1)                 0         
semantic_features   |  (1)                 =>   (200)               600       
tanh2               |  (200)               =>   (200)               0         
dense2              |  (200)               =>   (200)               40200     
alpha_s             |  (200)               =>   (200)               0         
mult_2              |  (200)               =>   (200)               0         
co_att_in           |  (3)                 =>   (203)               0         
co_attention        |  (203)               =>   (200)               40800     
input1              |  (3)                 =>   (3)                 0         
reduction_argmax2   |  (3)                 =>   (1)                 0         
word_embeddings     |  (1)                 =>   (200)               600       
concat1             |  (200)               =>   (400)               0         
lstm                |  (400)               =>   (200)               480800    
top_dense           |  (200)               =>   (3)                 603       
rnn_out             |  (3)                 =>   (3)                 0         
-------------------------------------------------------------------------------
Total params: 563627
Trainable params: 563627
Non-trainable params: 0

Vec2Seq 1 to 3
Recurrent net output sequence length=3
CS with full memory setup
Building model without initialization
Unroll on device
Recurrent net output sequence length=3
1 epochs of 1 batches of size 20
Epoch 1
[██████████████████████████████████████████████████] 1 rnn_out[loss=1.069 metric=0.600] 0.0636 secs/batch
0.0636 secs/epoch
[ONNX::Export] Warning: The LSTM layer lstm has mask_zeros=true. This attribute is not supported in ONNX, so the model exported will not have this attribute.
==================================================================
⚠️  LDense only works over 2D tensors (LDense) ⚠️
==================================================================

Traceback (most recent call last):
  File "2022_lstm.py", line 66, in <module>
    rnn = eddl.import_net_from_onnx_file("rnn.onnx")
  File "/root/miniconda3/envs/pyeddl-cudnn/lib/python3.8/site-packages/pyeddl/eddl.py", line 2889, in import_net_from_onnx_file
    return _eddl.import_net_from_onnx_file(path, mem, log_level)
RuntimeError: RuntimeError: LDense

chavicoski commented 2 years ago

Hi,

I was able to reproduce the error with the C++ version of EDDL. The "develop" branch has a fix related to the export/import of this kind of decoder model, so I also ran the same code with the "develop" version, and it worked. The fix will therefore be available in PyEDDL with the next release.
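
Until then, here is a minimal sketch of how to verify the round trip, assuming "rnn.onnx" was produced by the script above and the installed build includes the develop-branch fix (CS_CPU is used here just to keep the check simple):

import pyeddl.eddl as eddl

# Re-import the ONNX file exported after training. With the develop-branch
# fix this call succeeds instead of raising "RuntimeError: LDense".
net = eddl.import_net_from_onnx_file("rnn.onnx")

# Rebuild without re-initializing weights (init_weights=False), so the
# trained parameters loaded from the ONNX file are kept.
eddl.build(
    net,
    eddl.adam(0.001),
    ["softmax_cross_entropy"],
    ["accuracy"],
    eddl.CS_CPU(),
    False,  # init_weights
)
eddl.summary(net)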

Thanks!