introlab / find-object

Find-Object project
http://introlab.github.io/find-object/
BSD 3-Clause "New" or "Revised" License

Run SIFT with KdTree slow for large dataset #65

Open dinhnn2103 opened 6 years ago

dinhnn2103 commented 6 years ago

Hi matlabbe,

I am using Find-Object for recognition on my test dataset (~25,000 images, including some duplicates).

At the beginning of training (screenshot: before)

After some training (screenshot: after)

I train with the SIFT descriptor and KDTree as the nearest neighbor strategy. Find-Object seems to get slower on my dataset as more objects are added. This is a serious problem because my real dataset will have more than 25,000 images.

I searched and found that this may be because the FLANN KD-tree in OpenCV is slow: https://github.com/opencv/opencv/issues/10325

They suggest that using nanoflann can improve speed: https://github.com/jlblancoc/nanoflann

Can you help me? Thanks

matlabbe commented 6 years ago

This "lag" is caused by the reconstruction of the vocabulary tree; see the General/vocabularyUpdateMinWords parameter (screenshot: screenshot_2018-08-29_17-06-56). You can increase it to 10 000 or even 100 000 and see the difference.
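To illustrate why a larger General/vocabularyUpdateMinWords reduces the lag, here is a toy model in C++. It is not find-object's actual code: it only assumes that words arrive in batches (one batch per object) and that the whole index is rebuilt once at least `updateMinWords` new words are pending. The names `simulateRebuilds` and `RebuildStats` are made up for this sketch.

```cpp
#include <algorithm>
#include <cstdint>

// Toy model (an assumption, not find-object's implementation):
// the whole index is rebuilt whenever at least `updateMinWords`
// words have been added since the last rebuild.
struct RebuildStats {
    std::uint64_t rebuilds = 0;        // number of full index rebuilds
    std::uint64_t wordsReindexed = 0;  // sum of index sizes over all rebuilds
};

RebuildStats simulateRebuilds(std::uint64_t totalWords,
                              std::uint64_t wordsPerObject,
                              std::uint64_t updateMinWords)
{
    RebuildStats stats;
    if (wordsPerObject == 0) {
        return stats;  // nothing is ever added
    }
    std::uint64_t indexed = 0;  // words currently in the index
    std::uint64_t pending = 0;  // words added but not yet indexed
    std::uint64_t added = 0;    // words added so far
    while (added < totalWords) {
        const std::uint64_t batch = std::min(wordsPerObject, totalWords - added);
        added += batch;
        pending += batch;
        if (pending >= updateMinWords) {
            // A rebuild touches every word accumulated so far.
            indexed += pending;
            pending = 0;
            ++stats.rebuilds;
            stats.wordsReindexed += indexed;
        }
    }
    return stats;
}
```

In this model, 10 000 words added in batches of 500 trigger 5 rebuilds with a threshold of 2000 but only 2 with a threshold of 5000. Because each rebuild processes the whole vocabulary, the total work grows quickly as the vocabulary gets larger, which matches the slowdown described above.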

For the memory allocation in FLANN, some fixes have been integrated in rtabmap's FLANN version for Android; see https://github.com/introlab/rtabmap/blob/master/corelib/src/rtflann/algorithms/kdtree_index.h#L317-L658. The search functions allocate the heap only once instead of for each search, though this only works when parallelization is disabled. On a desktop computer, I didn't really see a big issue with the memory allocation, so the original parallel version is used.

nanoflann does seem to have been updated recently; it could be integrated as an option.

cheers, Mathieu

nhudinh2103 commented 6 years ago

Hi matlabbe,

Thanks for your quick and detailed response.

I have changed General/vocabularyUpdateMinWords from 2000 to 16384. This improves speed, but after running for some time, the processing time suddenly increases.

Here's a screenshot (this one uses 2000, but 16384 gives the same result after reaching ~6,600,000 words): (screenshot: bug-memory)

matlabbe commented 6 years ago

If the latency always appears after 6,600,000 words, your computer may be running out of RAM, so that swap (on the hard drive) is used. The update still seems to happen after 2000 words rather than 16384; is this the latest find-object built from source? (It works on my version when I increase General/vocabularyUpdateMinWords.)
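As a rough check of the RAM hypothesis, the raw descriptor storage can be estimated. The sketch below assumes SIFT descriptors of 128 values stored as 32-bit floats (4 bytes each); the helper names are made up for illustration. It counts only the descriptor data, so the KD-tree structures and any other copies held in memory come on top of this figure.

```cpp
#include <cstdint>

// Bytes needed to store `words` descriptors of `dims` float32 values each.
// Assumes 4 bytes per value (32-bit float).
constexpr std::uint64_t descriptorBytes(std::uint64_t words, std::uint64_t dims)
{
    return words * dims * 4ULL;
}

// Same quantity expressed in GiB (1 GiB = 1024^3 bytes).
constexpr double descriptorGiB(std::uint64_t words, std::uint64_t dims)
{
    return static_cast<double>(descriptorBytes(words, dims)) / (1024.0 * 1024.0 * 1024.0);
}
```

For 6,600,000 words, `descriptorBytes(6600000, 128)` gives 3,379,200,000 bytes, a little over 3 GiB for the descriptors alone, so running into swap on a machine with limited RAM is plausible.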

Another approach would be to use a fixed vocabulary. 6 000 000 words is quite a huge vocabulary. You may stop there (or even at a smaller size), set General/vocabularyFixed to true, then export the vocabulary (File->Save vocabulary...) in binary format to avoid a large ASCII vocabulary file (binary format is only available with this latest commit).

dinhnn2103 commented 6 years ago

Hi matlabbe,

Thanks for your suggestion, but now I have another problem. After adding my dataset to find_object and creating a session, find_object cannot detect objects in a scene image when a request is sent.

Steps to reproduce:

  1. Run find_object in console mode to train on my dataset with the SIFT detector/descriptor (the find_object UI does not work because there are too many images):

    ./find_object --console --objects <myDatasetPath> --session_new my_session.bin

  2. After processing the ~25,000 images above, stop find_object (Ctrl-C); find_object automatically saves my_session.bin.

  3. Reload find_object with the session:

    ./find_object --console --debug --session my_session.bin

  4. Run tcpRequest from the find_object bin directory.

    Result: find_object gets stuck in "Do Nearest Neighbor" and then throws an error:

(screenshot: find_object_server_error)

I tried debugging a little and found that the knnSearch() call at Vocabulary:461 is where the program gets stuck.

flannIndex_.knnSearch(descriptors, results, dists, k,
        cv::flann::SearchParams(
        Settings::getNearestNeighbor_search_checks(),
        Settings::getNearestNeighbor_search_eps(),
        Settings::getNearestNeighbor_search_sorted()));

Can you help me? Thanks

matlabbe commented 6 years ago

I revised how arguments are handled when starting find-object. One problem was that changing parameters in a loaded session was not possible, so we couldn't switch to a fixed vocabulary with a session created with a non-fixed vocabulary. Update the code to the latest version; it is now possible to do something like this:

    ./find_object --console --session_new session.bin --objects ~/objs --General/vocabularyIncremental true

We are creating a new session with default parameters (+ custom incremental vocabulary) using some objects.

    ./find_object --console --session session.bin --objects ~/objs2 --General/vocabularyFixed true

We reload the session but we fix the vocabulary, and we add more objects.

    ./find_object --console --session session.bin --scene ~/scene.jpg --json results.json

We reload the full session and feed a scene to detect objects.

Can you try with the latest version?

nhudinh2103 commented 6 years ago

I am trying your suggestion above.

Can you add the FLANN library from rtabmap to find_object (and use it instead of OpenCV's FLANN)? https://github.com/introlab/rtabmap/tree/master/corelib/src/rtflann

I think there may be some problems with OpenCV's FLANN (it has not been updated for a long time).

matlabbe commented 6 years ago

It looks like the FLANN version included in OpenCV has been updated over the past years: https://github.com/opencv/opencv/commits/master/modules/flann/include/opencv2/flann and so has the official FLANN project: https://github.com/mariusmuja/flann/commits/master

I am not sure whether fixes from the official library have already been pushed back to the OpenCV version. The rtabmap version is older than both of those, so I am not sure it would fix your problems. To better debug the FLANN error from your previous post, build find-object in debug mode (cmake -DCMAKE_BUILD_TYPE=Debug), as well as OpenCV. Then launch find-object in gdb:

gdb --args find_object --console ...

Type "bt" when it crashes to get the call stack.

Also, if you are able to reproduce the problem with a small subset of images, you may share this dataset so it will be easier for us to debug and reproduce the problem.

dinhnn2103 commented 6 years ago

Matlabbe, do you have an email address? I will share a small subset of images with you if I can.

Here's the stacktrace:

#0  0x00007f671c0a1277 in __GI_raise (sig=sig@entry=6)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007f671c0a2968 in __GI_abort () at abort.c:90
#2  0x00007f671c09a096 in __assert_fail_base (fmt=0x7f671c1f5580 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x7f6722a34af4 "result.full()", file=file@entry=0x7f6722a34e38 "/home/dinhnn/git/opencv/modules/flann/include/opencv2/flann/kdtree_index.h", line=line@entry=463, function=function@entry=0x7f6722a36c00 <cvflann::KDTreeIndex<cvflann::L2<float> >::getNeighbors(cvflann::ResultSet<float>&, float const*, int, float)::__PRETTY_FUNCTION__> "void cvflann::KDTreeIndex<Distance>::getNeighbors(cvflann::ResultSet<typename Distance::ResultType>&, const ElementType*, int, float) [with Distance = cvflann::L2<float>; typename Distance::ResultType"...) at assert.c:92
#3  0x00007f671c09a142 in __GI___assert_fail (assertion=0x7f6722a34af4 "result.full()", file=0x7f6722a34e38 "/home/dinhnn/git/opencv/modules/flann/include/opencv2/flann/kdtree_index.h", line=463, function=0x7f6722a36c00 <cvflann::KDTreeIndex<cvflann::L2<float> >::getNeighbors(cvflann::ResultSet<float>&, float const*, int, float)::__PRETTY_FUNCTION__> "void cvflann::KDTreeIndex<Distance>::getNeighbors(cvflann::ResultSet<typename Distance::ResultType>&, const ElementType*, int, float) [with Distance = cvflann::L2<float>; typename Distance::ResultType"...)
    at assert.c:101
#4  0x00007f6722a11080 in cvflann::KDTreeIndex<cvflann::L2<float> >::getNeighbors(cvflann::ResultSet<float>&, float const*, int, float) (this=0x21aa4e0, result=..., vec=0x221090000, maxCheck=32, epsError=1)
    at /home/dinhnn/git/opencv/modules/flann/include/opencv2/flann/kdtree_index.h:463
#5  0x00007f6722a02e07 in cvflann::KDTreeIndex<cvflann::L2<float> >::findNeighbors(cvflann::ResultSet<float>&, float const*, cvflann::SearchParams const&) (this=0x21aa4e0, result=..., vec=0x221090000, searchParams=...)
    at /home/dinhnn/git/opencv/modules/flann/include/opencv2/flann/kdtree_index.h:213
#6  0x00007f67229ede51 in cvflann::NNIndex<cvflann::L2<float> >::knnSearch(cvflann::Matrix<float> const&, cvflann::Matrix<int>&, cvflann::Matrix<float>&, int, cvflann::SearchParams const&) (this=0x21aa4e0, queries=..., indices=..., dists=..., knn=2, params=...)
    at /home/dinhnn/git/opencv/modules/flann/include/opencv2/flann/nn_index.h:86
#7  0x00007f67229ead09 in cvflann::Index<cvflann::L2<float> >::knnSearch(cvflann::Matrix<float> const&, cvflann::Matrix<int>&, cvflann::Matrix<float>&, int, cvflann::SearchParams const&) (this=0x22f9310, queries=..., indices=..., dists=..., knn=2, params=...)
    at /home/dinhnn/git/opencv/modules/flann/include/opencv2/flann/flann_base.hpp:218
#8  0x00007f67229e66b8 in cv::flann::runKnnSearch_<cvflann::L2<float>, cvflann::Index<cvflann::L2<float> > >(void*, cv::Mat const&, cv::Mat&, cv::Mat&, int, cv::flann::SearchParams const&) (index=0x22f9310, query=..., indices=..., dists=..., knn=2, params=...)
    at /home/dinhnn/git/opencv/modules/flann/src/miniflann.cpp:495
#9  0x00007f67229e3574 in cv::flann::runKnnSearch<cvflann::L2<float> >(void*, cv::Mat const&, cv::Mat&, cv::Mat&, int, cv::flann::SearchParams const&) (index=0x22f9310, query=..., indices=..., dists=..., knn=2, params=...)
    at /home/dinhnn/git/opencv/modules/flann/src/miniflann.cpp:503
#10 0x00007f67229e0914 in cv::flann::Index::knnSearch(cv::_InputArray const&, cv::_OutputArray const&, cv::_OutputArray const&, int, cv::flann::SearchParams const&) (this=0x1f47f08, _query=..., _indices=..., _dists=..., knn=2, params=...)
    at /home/dinhnn/git/opencv/modules/flann/src/miniflann.cpp:586
#11 0x00007f675cca94ae in find_object::Vocabulary::search(cv::Mat const&, cv::Mat&, cv::Mat&, int) (this=0x1f47f00, descriptorsIn=..., results=..., dists=..., k=2) at /home/dinhnn/git/find-object/src/Vocabulary.cpp:465
#12 0x00007f675cc8cc2c in find_object::FindObject::detect(cv::Mat const&, find_object::DetectionInfo&) const (this=0x2306870, image=..., info=...)
    at /home/dinhnn/git/find-object/src/FindObject.cpp:1502
#13 0x000000000041b56e in FindObjectWorker::detect(cv::Mat const&) (this=0x230e390, image=...)
    at /home/dinhnn/git/find-object/build/app/../../app/TcpServerPool.h:43
#14 0x000000000041af81 in FindObjectWorker::qt_static_metacall(QObject*, QMetaObject::Call, int, void**) (_o=0x230e390, _c=QMetaObject::InvokeMetaMethod, _id=1, _a=0x7f66e6f983d0)
    at /home/dinhnn/git/find-object/build/app/moc_TcpServerPool.cpp:93
#15 0x00007f675b9b8237 in QMetaObject::activate(QObject*, int, int, void**) ()
    at /lib64/libQt5Core.so.5
#16 0x00007f675ccdd559 in find_object::TcpServer::detectObject(cv::Mat const&) (this=0x21ed55800, _t1=...)
    at /home/dinhnn/git/find-object/build/src/__/include/find_object/moc_TcpServer.cpp:197
#17 0x00007f675cca5b7f in find_object::TcpServer::readReceivedData() (this=0x21ed55800) at /home/dinhnn/git/find-object/src/TcpServer.cpp:167
#18 0x00007f675ccdd241 in find_object::TcpServer::qt_static_metacall(QObject*, QMetaObject::Call, int, void**) (_o=0x21ed55800, _c=QMetaObject::InvokeMetaMethod, _id=5, _a=0x7f66e6f98670)
    at /home/dinhnn/git/find-object/build/src/__/include/find_object/moc_TcpServer.cpp:111
#19 0x00007f675b9b8237 in QMetaObject::activate(QObject*, int, int, void**) ()
    at /lib64/libQt5Core.so.5
#20 0x00007f675d0b5503 in QAbstractSocketPrivate::emitReadyRead(int) ()
    at /lib64/libQt5Network.so.5
#21 0x00007f675d0b5598 in QAbstractSocketPrivate::canReadNotification() ()
    at /lib64/libQt5Network.so.5
#22 0x00007f675d0c65b1 in QReadNotifier::event(QEvent*) ()
    at /lib64/libQt5Network.so.5
#23 0x00007f675b990205 in doNotify(QObject*, QEvent*) ()
    at /lib64/libQt5Core.so.5
#24 0x00007f675b9902e6 in QCoreApplication::notifyInternal2(QObject*, QEvent*) () at /lib64/libQt5Core.so.5
#25 0x00007f675b9df590 in socketNotifierSourceDispatch(_GSource*, int (*)(void*), void*) () at /lib64/libQt5Core.so.5
#26 0x00007f671857b969 in g_main_context_dispatch () at /lib64/libglib-2.0.so.0
#27 0x00007f671857bcc8 in g_main_context_iterate.isra.22 ()
    at /lib64/libglib-2.0.so.0
#28 0x00007f671857bd7c in g_main_context_iteration ()
    at /lib64/libglib-2.0.so.0
#29 0x00007f675b9deabc in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () at /lib64/libQt5Core.so.5
#30 0x00007f675b98edeb in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) () at /lib64/libQt5Core.so.5
#31 0x00007f675b7e26c8 in QThread::exec() () at /lib64/libQt5Core.so.5
#32 0x00007f675b7e6b71 in QThreadPrivate::start(void*) ()
    at /lib64/libQt5Core.so.5
#33 0x00007f671be56e25 in start_thread (arg=0x7f66e6f99700)
    at pthread_create.c:308
#34 0x00007f671c169bad in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

matlabbe commented 6 years ago

I fixed the problem of your session_bug_reproduce.zip example in this commit. The feature detector was not updated, so the features in the scene were not of the same type as those of the objects in the saved session.

For the error above, if you also have a subset of images that can reproduce the problem, it would be nice to have it. My gmail is the same as my username here if you don't want to share images publicly.
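A mismatch like this can be caught before the nearest neighbor search is attempted. The following is only a sketch of such a pre-check, not the fix made in the commit: `DescriptorInfo` and `checkSearchable` are hypothetical names, and the struct stands in for whatever descriptor matrix type the application uses.

```cpp
#include <cstddef>
#include <string>

// Hypothetical summary of a descriptor matrix (illustrative only).
struct DescriptorInfo {
    std::size_t rows;  // number of descriptors
    std::size_t dims;  // values per descriptor, e.g. 128 for SIFT
    bool isFloat;      // true for float descriptors (e.g. SIFT), false for binary ones (e.g. ORB)
};

// Returns an empty string if a k-NN search looks safe to attempt,
// otherwise a human-readable reason why it is not.
std::string checkSearchable(const DescriptorInfo& vocabulary,
                            const DescriptorInfo& scene,
                            std::size_t k)
{
    if (scene.rows == 0) {
        return "scene has no descriptors";
    }
    if (vocabulary.rows < k) {
        return "vocabulary has fewer than k words";
    }
    if (vocabulary.isFloat != scene.isFloat) {
        return "descriptor element types differ";
    }
    if (vocabulary.dims != scene.dims) {
        return "descriptor dimensions differ";
    }
    return "";
}
```

With a check of this kind, a session trained with SIFT (128 float values per descriptor) and a scene processed with a binary descriptor would be rejected with a readable message instead of reaching an assertion inside the search.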

cheers, Mathieu