deephealthproject / eddl

European Distributed Deep Learning (EDDL) library. A general-purpose library initially developed to cover deep learning needs in healthcare use cases within the DeepHealth project.
https://deephealthproject.github.io/eddl/
MIT License
34 stars 10 forks source link

This is a bug. #163

Closed CostantinoGrana closed 4 years ago

CostantinoGrana commented 4 years ago

Check this out: https://github.com/deephealthproject/eddl/blob/02e37c0dfb674468495f4d0ae3c159de3b2d3cc0/src/serialization/onnx/eddl_onnx_import.cpp#L801-L815 In line 803 you create a Tensor which refers to a vector data. Unfortunately by design Tensor assumes that the pointer passed has been allocated with new and that it is its own property: gets ownership: it will try to delete it. This is wrong in both assumptions: std::vector<> is not allocated with new, nor you can steal its data. So in line 815 delete will fail.

It would be useful to have a dont_delete boolean flag in Tensor.

And in line 801 neuronas looks very Spanish... 😄

CostantinoGrana commented 4 years ago

Same problem here: https://github.com/deephealthproject/eddl/blob/02e37c0dfb674468495f4d0ae3c159de3b2d3cc0/src/serialization/onnx/eddl_onnx_import.cpp#L820

Quick fix is to change bias->data() to copy(bias->begin(), bias->end(), new float[bias->size()]) - bias->size() Not nice, but it works.

Unfortunately after fixing this, I realized that the problem is everywhere in the file. A lot of fixing required, or change the logic of Tensor borrowed memory deleting.

salvacarrion commented 4 years ago

I agree. Indeed, we were already working on that in another branch ("tensor_refactor"). For the release v0.7, that method will be either removed or marked as private since it is quite easy to mess things up (e.g.: trying to delete static data, passing a CPU pointer to a GPU tensor, or vice-versa, etc).

For now, @Seraphid could add your fixed as a patch

RParedesPalacios commented 4 years ago

Costantino, thanks for detecting that.

CostantinoGrana commented 4 years ago

Just to complete it, this leaks memory like crazy! https://github.com/deephealthproject/eddl/blob/02e37c0dfb674468495f4d0ae3c159de3b2d3cc0/src/serialization/onnx/eddl_onnx_import.cpp#L612 Why dynamic allocation of local vectors? Nobody is deleting them. Just make them local objects. Or even better, use Tensors directly. Moreover delete is not a function. Remove the parenthesis.

RParedesPalacios commented 4 years ago

Well, calm down, we will take care of this, thanks

Seraphid commented 4 years ago

Hi Constantino,

We will fix this problems with memory allocation. Thanks for the information you provided.

RParedesPalacios commented 4 years ago

Anyway, i will find some time to analyse the ECVL code ;)

salvacarrion commented 4 years ago

My two cents: Valgrind

CostantinoGrana commented 4 years ago

Anyway, i will find some time to analyse the ECVL code ;)

Please do it! I'm sure you will find many bugs...

RParedesPalacios commented 4 years ago

Anyway, i will find some time to analyse the ECVL code ;)

Please do it! I'm sure you will find many bugs...

i am quite confident about you are doing