Open davidt0x opened 12 years ago
Hi,
On Fri, Mar 2, 2012 at 11:42 AM, davidt0x < reply@reply.github.com
wrote:
Hey,
First off, thanks for a great library! You guys are providing an invaluable resource so please keep it.
That being said, I just want to bring to your attention a bug that I believe I found in the matlab interface. While using the mex interface I kept getting seg faults that would seemingly occur at random. It seemed like they would occur more frequently when working with larger amounts of data. I believe I have tracked the problem down to the improper use of pointers when building or loading and index. The functions _build_index and _load_index within the nearest_neighbors.cpp code both pass a pointer of the dataset to the API when constructing and index. However, I see that the FLANN code does not perform a copy of the dataset when constructing the index. This was probably intended behavior I imagine. However, this causes problems when trying to make the index persistent accross mex calls because that dataset memory can be deallocated after the flann_load_index or flann_build_index calls are completely. This is because the dataset memory is allocated for the arguments of the se mex calls and after they are done will most likely be cleaned up by the memory manager after they finish.
Yes, the index does not make a copy of the dataset and that is indeed the intended behavior. The reason for that is that in case of large datasets it's not desirable to create a copy and use twice the memory (in case of very large datasets there's not even enough memory on the machine to do that).
I hope all that makes sense. The easiest way that I found to fix this is to modify nearest_neighbors.cpp such that it makes a copy of the dataset passed in via the input arguments to mexFunction. This copy must also be kept track of so that it can be free'd when flann_free_index is called. I have implemented this myself and it seems to resolve all segfaults that I was encountering. There is probably a better fix.
I just wanted to let you all know of the problem. As I said, this bug might be hard to reproduce as it doesn't always cause seg faults. I did not have any problems until I moved to larger datasets.
I think that if a reference is kept in Matlab to the dataset being used, Matlab's memory manager would not deallocate the memory. I'll look into the possibility of preventing the deallocation from the MEX call.
Thanks, Marius
Hey Marius,
Thanks for getting back to me. Yes, I figured that is why you guys were not copying the data. It makes the most sense in C but becomes annoying in matlab for making things persistent accross mex calls because you have the matlab memory manager to deal with. To make the memory allocated for the arguments persistent you could try:
http://www.mathworks.com/help/techdoc/apiref/mexmakememorypersistent.html http://www.mathworks.com/help/techdoc/apiref/mexmakearraypersistent.html
This would seem to be better than copying it.
I believe I was running into this problem even when I kept a reference of the dataset in memory around but I could definitely be mistaken. My code was calling a function that created an index. The function accepted a matrix from which the dataset is created and derived. The dataset is then used to build the index and both are returned from the function. Maybe the reference was being lost because matlab is making copies or something. As I said, it was kind of annoying to reproduce the bug, I just wanted to give you guys a heads up that it is possible if you didn't already know. Let me know if I can help at all and once again thanks for the great library.
Dave
On Tue, Mar 6, 2012 at 10:50 PM, Marius Muja < reply@reply.github.com
wrote:
Hi,
On Fri, Mar 2, 2012 at 11:42 AM, davidt0x < reply@reply.github.com
wrote:
Hey,
First off, thanks for a great library! You guys are providing an invaluable resource so please keep it.
That being said, I just want to bring to your attention a bug that I believe I found in the matlab interface. While using the mex interface I kept getting seg faults that would seemingly occur at random. It seemed like they would occur more frequently when working with larger amounts of data. I believe I have tracked the problem down to the improper use of pointers when building or loading and index. The functions _build_index and _load_index within the nearest_neighbors.cpp code both pass a pointer of the dataset to the API when constructing and index. However, I see that the FLANN code does not perform a copy of the dataset when constructing the index. This was probably intended behavior I imagine. However, this causes problems when trying to make the index persistent accross mex calls because that dataset memory can be deallocated after the flann_load_index or flann_build_index calls are completely. This is because the dataset memory is allocated for the arguments of the se mex calls and after they are done will most likely be cleaned up by the memory manager after they finish.
Yes, the index does not make a copy of the dataset and that is indeed the intended behavior. The reason for that is that in case of large datasets it's not desirable to create a copy and use twice the memory (in case of very large datasets there's not even enough memory on the machine to do that).
I hope all that makes sense. The easiest way that I found to fix this is to modify nearest_neighbors.cpp such that it makes a copy of the dataset passed in via the input arguments to mexFunction. This copy must also be kept track of so that it can be free'd when flann_free_index is called. I have implemented this myself and it seems to resolve all segfaults that I was encountering. There is probably a better fix.
I just wanted to let you all know of the problem. As I said, this bug might be hard to reproduce as it doesn't always cause seg faults. I did not have any problems until I moved to larger datasets.
I think that if a reference is kept in Matlab to the dataset being used, Matlab's memory manager would not deallocate the memory. I'll look into the possibility of preventing the deallocation from the MEX call.
Thanks, Marius
Reply to this email directly or view it on GitHub: https://github.com/mariusmuja/flann/issues/32#issuecomment-4361837
On Tue, Mar 6, 2012 at 8:08 PM, davidt0x < reply@reply.github.com
wrote:
Hey Marius,
Thanks for getting back to me. Yes, I figured that is why you guys were not copying the data. It makes the most sense in C but becomes annoying in matlab for making things persistent accross mex calls because you have the matlab memory manager to deal with. To make the memory allocated for the arguments persistent you could try:
http://www.mathworks.com/help/techdoc/apiref/mexmakememorypersistent.html http://www.mathworks.com/help/techdoc/apiref/mexmakearraypersistent.html
This would seem to be better than copying it.
Yes, it seems that this should solve the problem. I will add this to the matlab mex file.
I believe I was running into this problem even when I kept a reference of the dataset in memory around but I could definitely be mistaken. My code was calling a function that created an index. The function accepted a matrix from which the dataset is created and derived. The dataset is then used to build the index and both are returned from the function. Maybe the reference was being lost because matlab is making copies or something. As I said, it was kind of annoying to reproduce the bug, I just wanted to give you guys a heads up that it is possible if you didn't already know. Let me know if I can help at all and once again thanks for the great library.
Yes, I believe you have correctly identified the problem, Matlab is using pass-by-value semantics when passing arguments to a function and it's making a copy of the dataset which it eventually deallocates causing the problems you have experienced.
Thanks for the bug report, Marius
Hi, did this problem ever get fixed in the master branch? I seem to be running into some segmentation violation problems which cause Matlab to crash while using flann_search. I'm not sure if it's a related problem.
I have tried running my code on both Win7 and OSX with the same problem.
No, it seems this issue has slipped under my radar. I just pushed a change to the master branch that should fix this (by making a copy of the dataset when building the index). Give it a shot and see if it fixes the segfaults you are experiencing.
Seems to have fixed it for me. Thanks for the quick response and great library!
EDIT: It appears I spoke too soon. I added a few more datasets, and the error has returned when a dataset of ~13000x128 is used with flann_search.
Can you post a stack trace or, even better, a minimal example that reproduces the crash?
The following code produces and subsequently processes random data arrays with the same dimensions I am currently working with. A segmentation error should be produced. I am currently using Matlab R2009a.
EDIT: I should mention I have also tried this on Matlab R2012a Unix with identical results.
dimension1 = [
594 2999 90 562 226 192
582 4282 144 699 332 188
257 1062 1026 277 161 658
267 947 1112 311 151 361
242 418 559 199 720 304
182 526 749 177 687 323
915 404 382 713 438 185
785 251 434 526 378 253
375 157 554 372 234 694
370 179 372 400 192 615
169 2234 287 224 653 264
283 2362 172 223 662 269
642 667 72 796 284 173
637 941 149 460 240 171
385 531 1961 562 101 641
181 588 2052 348 193 367
82 289 1288 182 411 294
145 310 897 290 618 249
13034 222 715 634 239 232
11548 111 527 521 294 226
];
dimension2 = 128;
dimension1_test = [546 666 471 613 370 553];
for i = 1:20
data(i).a = rand([dimension1(i,1),dimension2]);
data(i).b = rand([dimension1(i,2),dimension2]);
data(i).c = rand([dimension1(i,3),dimension2]);
data(i).d = rand([dimension1(i,4),dimension2]);
data(i).e = rand([dimension1(i,5),dimension2]);
data(i).f = rand([dimension1(i,6),dimension2]);
end
data_test.a = rand([dimension1_test(1),dimension2]);
data_test.b = rand([dimension1_test(2),dimension2]);
data_test.c = rand([dimension1_test(3),dimension2]);
data_test.d = rand([dimension1_test(4),dimension2]);
data_test.e = rand([dimension1_test(5),dimension2]);
data_test.f = rand([dimension1_test(6),dimension2]);
build_params.target_precision = .9;
build_params.build_weight = .01;
build_params.memory_weight = 0;
build_params.algorithm = 'kmeans';
for i = 1:20
[index(i).a parameters] = flann_build_index( data(i).a,build_params );
[index(i).b parameters] = flann_build_index( data(i).b,build_params );
[index(i).c parameters] = flann_build_index( data(i).c,build_params );
[index(i).d parameters] = flann_build_index( data(i).d,build_params );
[index(i).e parameters] = flann_build_index( data(i).e,build_params );
[index(i).f parameters] = flann_build_index( data(i).f,build_params );
end
for i = 1:20
[results(i).a distances(i).a] = flann_search(index(i).a,data_test.a,1,parameters);
[results(i).b distances(i).b] = flann_search(index(i).b,data_test.b,1,parameters);
[results(i).c distances(i).c] = flann_search(index(i).c,data_test.c,1,parameters);
[results(i).d distances(i).d] = flann_search(index(i).d,data_test.d,1,parameters);
[results(i).e distances(i).e] = flann_search(index(i).e,data_test.e,1,parameters);
[results(i).f distances(i).f] = flann_search(index(i).f,data_test.f,1,parameters);
end
Thanks for all the help!
Seems like in version 1.8.4 bug still unfixed.
EDIT. Seems to work! If somebody else will be stuck with segfaults I will live the note how i fixed it. When I call
index_ = flann_load_index(indexPath, dataset);
if I used indexPath as relative path
indexPath = '~/mydir/index.bin';
then I got SEGFAULT. Using absolute path solves the problem!
indexPath = '/home/username/mydir/index.bin';
Hey,
First off, thanks for a great library! You guys are providing an invaluable resource so please keep it.
That being said, I just want to bring to your attention a bug that I believe I found in the matlab interface. While using the mex interface I kept getting seg faults that would seemingly occur at random. It seemed like they would occur more frequently when working with larger amounts of data. I believe I have tracked the problem down to the improper use of pointers when building or loading and index. The functions _build_index and _load_index within the nearest_neighbors.cpp code both pass a pointer of the dataset to the API when constructing and index. However, I see that the FLANN code does not perform a copy of the dataset when constructing the index. This was probably intended behavior I imagine. However, this causes problems when trying to make the index persistent accross mex calls because that dataset memory can be deallocated after the flann_load_index or flann_build_index calls are completely. This is because the dataset memory is allocated for the arguments of these mex calls and after they are done will most likely be cleaned up by the memory manager after they finish.
I hope all that makes sense. The easiest way that I found to fix this is to modify nearest_neighbors.cpp such that it makes a copy of the dataset passed in via the input arguments to mexFunction. This copy must also be kept track of so that it can be free'd when flann_free_index is called. I have implemented this myself and it seems to resolve all segfaults that I was encountering. There is probably a better fix.
I just wanted to let you all know of the problem. As I said, this bug might be hard to reproduce as it doesn't always cause seg faults. I did not have any problems until I moved to larger datasets.
Just to note I am using
Matlab Version 7.12.0.635 (R2011a) 64-bit (glnxa64)
Thanks, Dave