flann-lib / flann

Fast Library for Approximate Nearest Neighbors
http://people.cs.ubc.ca/~mariusm/flann
Other
2.23k stars 650 forks source link

Will you write thread safe version of PooledAllocator::allocateMemory #129

Open choni opened 11 years ago

choni commented 11 years ago

Hello, Marius.

I have index building time problems. I use HierarchicalClusteringIndex.

And, I tried multi thread indexing. It is PooledAllocator::allocateMemory() then to have obstacled.

So, I added a mutex to top of PooledAllocator::allocateMemory. And, devided HierarchicalClusteringIndex::buildIndex() to threads by trees.

Results are: data size = 512bits * 79077820 original building time = more than 30 minutes. (MacBook Pro 2.7 GHz Intel Core i7) multi thread version = about 9 minutes.

I used boost library for multi threads and my codes have not versatility, so I hesitate to send pull request. Will you write thread safe version of PooledAllocator::allocateMemory and multi thread indexing.

Changes is below: --- allocator.h PooledAllocator::allocateMemory()

boost::mutex mtxAllocate;  // declare mutex
/**
 * Returns a pointer to a piece of new memory of the given size in bytes
 * allocated from the pool.
 */
void* allocateMemory(int size)
{
boost::mutex::scoped_lock lock(mtxAllocate); // only this line added in this method
    int blocksize;
    .
    .
    .

}

--- HierarchicalClusteringIndex::buildIndex()

void buildTreeIndex(int treeIdx) {
    int i = treeIdx;

if (indices[i] != NULL) {
    delete [] indices[i];
}
indices[i] = new int[size_];
for (size_t j=0; j<size_; ++j) {
    indices[i][j] = j;
}
root[i] = pool.allocate<Node>();
computeClustering(root[i], indices[i], size_, branching_,0);

}

/**
 * Builds the index
 */
void buildIndex()
{
    if (branching_<2) {
        throw FLANNException("Branching factor must be at least 2");
    }

    boost::thread_group treeThreads;
    for (int i=0; i<trees_; ++i) {
        treeThreads.create_thread(bind(&HierarchicalClusteringIndex::buildTreeIndex, this, i));
    }
    treeThreads.join_all();
}
mariusmuja commented 11 years ago

Hi,

Having the option of muti-threaded index building is a good idea. However, I'd like to not have to add boost as a dependency to the FLANN core library because FLANN is used by other libraries that don't necessarily depend on Boost (OpenCV being one example). An option would be to use OpenMP instead of Boost threads, as FLANN already uses OpenMP for search.

I'll add this to my TODO list, however patches are welcome.

Cheers, Marius

choni commented 11 years ago

Hello. Thanks, for your reply.

I divided indexing to threads by trees, cause HierarchicalClusteringIndex::computeClustering contains some "for" clauses that is NOT thread safe, even after changed PooledAllocator::allocateMemory() to be thread safe.

But, dividing by trees restricts number of threads.

How do you think about it ?

Regards, SHIINA