flann-lib / flann

Fast Library for Approximate Nearest Neighbors
http://people.cs.ubc.ca/~mariusm/flann
Other
2.25k stars 646 forks source link

MATLAB Bug #32

Open davidt0x opened 12 years ago

davidt0x commented 12 years ago

Hey,

First off, thanks for a great library! You guys are providing an invaluable resource so please keep it.

That being said, I just want to bring to your attention a bug that I believe I found in the matlab interface. While using the mex interface I kept getting seg faults that would seemingly occur at random. It seemed like they would occur more frequently when working with larger amounts of data. I believe I have tracked the problem down to the improper use of pointers when building or loading and index. The functions _build_index and _load_index within the nearest_neighbors.cpp code both pass a pointer of the dataset to the API when constructing and index. However, I see that the FLANN code does not perform a copy of the dataset when constructing the index. This was probably intended behavior I imagine. However, this causes problems when trying to make the index persistent accross mex calls because that dataset memory can be deallocated after the flann_load_index or flann_build_index calls are completely. This is because the dataset memory is allocated for the arguments of these mex calls and after they are done will most likely be cleaned up by the memory manager after they finish.

I hope all that makes sense. The easiest way that I found to fix this is to modify nearest_neighbors.cpp such that it makes a copy of the dataset passed in via the input arguments to mexFunction. This copy must also be kept track of so that it can be free'd when flann_free_index is called. I have implemented this myself and it seems to resolve all segfaults that I was encountering. There is probably a better fix.

I just wanted to let you all know of the problem. As I said, this bug might be hard to reproduce as it doesn't always cause seg faults. I did not have any problems until I moved to larger datasets.

Just to note I am using

Matlab Version 7.12.0.635 (R2011a) 64-bit (glnxa64)

Thanks, Dave

mariusmuja commented 12 years ago

Hi,

On Fri, Mar 2, 2012 at 11:42 AM, davidt0x < reply@reply.github.com

wrote:

Hey,

First off, thanks for a great library! You guys are providing an invaluable resource so please keep it.

That being said, I just want to bring to your attention a bug that I believe I found in the matlab interface. While using the mex interface I kept getting seg faults that would seemingly occur at random. It seemed like they would occur more frequently when working with larger amounts of data. I believe I have tracked the problem down to the improper use of pointers when building or loading and index. The functions _build_index and _load_index within the nearest_neighbors.cpp code both pass a pointer of the dataset to the API when constructing and index. However, I see that the FLANN code does not perform a copy of the dataset when constructing the index. This was probably intended behavior I imagine. However, this causes problems when trying to make the index persistent accross mex calls because that dataset memory can be deallocated after the flann_load_index or flann_build_index calls are completely. This is because the dataset memory is allocated for the arguments of the se mex calls and after they are done will most likely be cleaned up by the memory manager after they finish.

Yes, the index does not make a copy of the dataset and that is indeed the intended behavior. The reason for that is that in case of large datasets it's not desirable to create a copy and use twice the memory (in case of very large datasets there's not even enough memory on the machine to do that).

I hope all that makes sense. The easiest way that I found to fix this is to modify nearest_neighbors.cpp such that it makes a copy of the dataset passed in via the input arguments to mexFunction. This copy must also be kept track of so that it can be free'd when flann_free_index is called. I have implemented this myself and it seems to resolve all segfaults that I was encountering. There is probably a better fix.

I just wanted to let you all know of the problem. As I said, this bug might be hard to reproduce as it doesn't always cause seg faults. I did not have any problems until I moved to larger datasets.

I think that if a reference is kept in Matlab to the dataset being used, Matlab's memory manager would not deallocate the memory. I'll look into the possibility of preventing the deallocation from the MEX call.

Thanks, Marius

davidt0x commented 12 years ago

Hey Marius,

Thanks for getting back to me. Yes, I figured that is why you guys were not copying the data. It makes the most sense in C but becomes annoying in matlab for making things persistent accross mex calls because you have the matlab memory manager to deal with. To make the memory allocated for the arguments persistent you could try:

http://www.mathworks.com/help/techdoc/apiref/mexmakememorypersistent.html http://www.mathworks.com/help/techdoc/apiref/mexmakearraypersistent.html

This would seem to be better than copying it.

I believe I was running into this problem even when I kept a reference of the dataset in memory around but I could definitely be mistaken. My code was calling a function that created an index. The function accepted a matrix from which the dataset is created and derived. The dataset is then used to build the index and both are returned from the function. Maybe the reference was being lost because matlab is making copies or something. As I said, it was kind of annoying to reproduce the bug, I just wanted to give you guys a heads up that it is possible if you didn't already know. Let me know if I can help at all and once again thanks for the great library.

Dave

On Tue, Mar 6, 2012 at 10:50 PM, Marius Muja < reply@reply.github.com

wrote:

Hi,

On Fri, Mar 2, 2012 at 11:42 AM, davidt0x < reply@reply.github.com

wrote:

Hey,

First off, thanks for a great library! You guys are providing an invaluable resource so please keep it.

That being said, I just want to bring to your attention a bug that I believe I found in the matlab interface. While using the mex interface I kept getting seg faults that would seemingly occur at random. It seemed like they would occur more frequently when working with larger amounts of data. I believe I have tracked the problem down to the improper use of pointers when building or loading and index. The functions _build_index and _load_index within the nearest_neighbors.cpp code both pass a pointer of the dataset to the API when constructing and index. However, I see that the FLANN code does not perform a copy of the dataset when constructing the index. This was probably intended behavior I imagine. However, this causes problems when trying to make the index persistent accross mex calls because that dataset memory can be deallocated after the flann_load_index or flann_build_index calls are completely. This is because the dataset memory is allocated for the arguments of the se mex calls and after they are done will most likely be cleaned up by the memory manager after they finish.

Yes, the index does not make a copy of the dataset and that is indeed the intended behavior. The reason for that is that in case of large datasets it's not desirable to create a copy and use twice the memory (in case of very large datasets there's not even enough memory on the machine to do that).

I hope all that makes sense. The easiest way that I found to fix this is to modify nearest_neighbors.cpp such that it makes a copy of the dataset passed in via the input arguments to mexFunction. This copy must also be kept track of so that it can be free'd when flann_free_index is called. I have implemented this myself and it seems to resolve all segfaults that I was encountering. There is probably a better fix.

I just wanted to let you all know of the problem. As I said, this bug might be hard to reproduce as it doesn't always cause seg faults. I did not have any problems until I moved to larger datasets.

I think that if a reference is kept in Matlab to the dataset being used, Matlab's memory manager would not deallocate the memory. I'll look into the possibility of preventing the deallocation from the MEX call.

Thanks, Marius


Reply to this email directly or view it on GitHub: https://github.com/mariusmuja/flann/issues/32#issuecomment-4361837

mariusmuja commented 12 years ago

On Tue, Mar 6, 2012 at 8:08 PM, davidt0x < reply@reply.github.com

wrote:

Hey Marius,

Thanks for getting back to me. Yes, I figured that is why you guys were not copying the data. It makes the most sense in C but becomes annoying in matlab for making things persistent accross mex calls because you have the matlab memory manager to deal with. To make the memory allocated for the arguments persistent you could try:

http://www.mathworks.com/help/techdoc/apiref/mexmakememorypersistent.html http://www.mathworks.com/help/techdoc/apiref/mexmakearraypersistent.html

This would seem to be better than copying it.

Yes, it seems that this should solve the problem. I will add this to the matlab mex file.

I believe I was running into this problem even when I kept a reference of the dataset in memory around but I could definitely be mistaken. My code was calling a function that created an index. The function accepted a matrix from which the dataset is created and derived. The dataset is then used to build the index and both are returned from the function. Maybe the reference was being lost because matlab is making copies or something. As I said, it was kind of annoying to reproduce the bug, I just wanted to give you guys a heads up that it is possible if you didn't already know. Let me know if I can help at all and once again thanks for the great library.

Yes, I believe you have correctly identified the problem, Matlab is using pass-by-value semantics when passing arguments to a function and it's making a copy of the dataset which it eventually deallocates causing the problems you have experienced.

Thanks for the bug report, Marius

MattKang commented 11 years ago

Hi, did this problem ever get fixed in the master branch? I seem to be running into some segmentation violation problems which cause Matlab to crash while using flann_search. I'm not sure if it's a related problem.

I have tried running my code on both Win7 and OSX with the same problem.

mariusmuja commented 11 years ago

No, it seems this issue has slipped under my radar. I just pushed a change to the master branch that should fix this (by making a copy of the dataset when building the index). Give it a shot and see if it fixes the segfaults you are experiencing.

MattKang commented 11 years ago

Seems to have fixed it for me. Thanks for the quick response and great library!

EDIT: It appears I spoke too soon. I added a few more datasets, and the error has returned when a dataset of ~13000x128 is used with flann_search.

mariusmuja commented 11 years ago

Can you post a stack trace or, even better, a minimal example that reproduces the crash?

MattKang commented 11 years ago

The following code produces and subsequently processes random data arrays with the same dimensions I am currently working with. A segmentation error should be produced. I am currently using Matlab R2009a.

EDIT: I should mention I have also tried this on Matlab R2012a Unix with identical results.

dimension1 = [         
         594        2999          90         562         226         192
         582        4282         144         699         332         188
         257        1062        1026         277         161         658
         267         947        1112         311         151         361
         242         418         559         199         720         304
         182         526         749         177         687         323
         915         404         382         713         438         185
         785         251         434         526         378         253
         375         157         554         372         234         694
         370         179         372         400         192         615
         169        2234         287         224         653         264
         283        2362         172         223         662         269
         642         667          72         796         284         173
         637         941         149         460         240         171
         385         531        1961         562         101         641
         181         588        2052         348         193         367
          82         289        1288         182         411         294
         145         310         897         290         618         249
       13034         222         715         634         239         232
       11548         111         527         521         294         226
];

dimension2 = 128;

dimension1_test = [546 666 471 613 370 553];

for i = 1:20
    data(i).a = rand([dimension1(i,1),dimension2]);
    data(i).b = rand([dimension1(i,2),dimension2]);
    data(i).c = rand([dimension1(i,3),dimension2]);
    data(i).d = rand([dimension1(i,4),dimension2]);
    data(i).e = rand([dimension1(i,5),dimension2]);
    data(i).f = rand([dimension1(i,6),dimension2]);
end

data_test.a = rand([dimension1_test(1),dimension2]);
data_test.b = rand([dimension1_test(2),dimension2]);
data_test.c = rand([dimension1_test(3),dimension2]);
data_test.d = rand([dimension1_test(4),dimension2]);
data_test.e = rand([dimension1_test(5),dimension2]);
data_test.f = rand([dimension1_test(6),dimension2]);

build_params.target_precision = .9;
build_params.build_weight = .01;
build_params.memory_weight = 0;
build_params.algorithm = 'kmeans';

for i = 1:20
    [index(i).a  parameters] = flann_build_index( data(i).a,build_params );
    [index(i).b  parameters] = flann_build_index( data(i).b,build_params );
    [index(i).c  parameters] = flann_build_index( data(i).c,build_params );
    [index(i).d  parameters] = flann_build_index( data(i).d,build_params );
    [index(i).e  parameters] = flann_build_index( data(i).e,build_params );
    [index(i).f  parameters] = flann_build_index( data(i).f,build_params );
end

for i = 1:20
    [results(i).a  distances(i).a] = flann_search(index(i).a,data_test.a,1,parameters);
    [results(i).b  distances(i).b] = flann_search(index(i).b,data_test.b,1,parameters);
    [results(i).c  distances(i).c] = flann_search(index(i).c,data_test.c,1,parameters);
    [results(i).d  distances(i).d] = flann_search(index(i).d,data_test.d,1,parameters);
    [results(i).e  distances(i).e] = flann_search(index(i).e,data_test.e,1,parameters);
    [results(i).f  distances(i).f] = flann_search(index(i).f,data_test.f,1,parameters);
end

Thanks for all the help!

asanakoy commented 9 years ago

Seems like in version 1.8.4 bug still unfixed.

EDIT. Seems to work! If somebody else will be stuck with segfaults I will live the note how i fixed it. When I call

index_ = flann_load_index(indexPath, dataset);

if I used indexPath as relative path

indexPath = '~/mydir/index.bin';

then I got SEGFAULT. Using absolute path solves the problem!

indexPath = '/home/username/mydir/index.bin';