Open openwzdh opened 11 years ago
Hi there!
Currently we're not using concurrent data structures, to keep the overhead as low as possible. There's a thread on the Mahout mailing list [1] about this very issue.
But you're certainly welcome to modify the code to get it to do what you want. :)
Also, please track my branch [2]. It's a bit more up to date and is where the work is happening.
[1] http://mail-archives.apache.org/mod_mbox/mahout-user/201212.mbox/%3CCALzSx%2BzOMYBod%3DspWgrsf4Cenqzv%3DnSnsALUP%3DRt%3DXQe6e6SVQ%40mail.gmail.com%3E
[2] https://github.com/dfilimon/knn
Thank you for your suggestions, Filimon! Updating the index concurrently is more difficult to implement than expected, so for now we have settled on a workaround. When samples are added, a background thread builds a new searcher to replace the old one. It is costly, but it works. We will continue to develop the concurrent version and benchmark it.
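A minimal sketch of that rebuild-and-swap workaround, for readers who want to try the same approach. Everything here is an assumption made for illustration: `SnapshotSearcher` and `SwappingIndex` are hypothetical names, the brute-force distance scan merely stands in for the real searcher, and none of this is the knn library's API. The idea is that searches only ever read an immutable snapshot, while a single background thread builds the replacement and publishes it through an `AtomicReference`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical stand-in for a searcher; it is never mutated after construction. */
final class SnapshotSearcher {
    private final List<double[]> vectors;

    SnapshotSearcher(List<double[]> vectors) {
        // Defensive copy, so later additions cannot touch this snapshot.
        this.vectors = Collections.unmodifiableList(new ArrayList<>(vectors));
    }

    int size() {
        return vectors.size();
    }

    /** Brute-force nearest neighbour by squared Euclidean distance; null if empty. */
    double[] nearest(double[] query) {
        double[] best = null;
        double bestDist = Double.POSITIVE_INFINITY;
        for (double[] v : vectors) {
            double d = 0;
            for (int i = 0; i < query.length; i++) {
                double diff = v[i] - query[i];
                d += diff * diff;
            }
            if (d < bestDist) {
                bestDist = d;
                best = v;
            }
        }
        return best;
    }
}

/** Hypothetical wrapper: readers use the current snapshot, a background thread swaps in new ones. */
final class SwappingIndex {
    private final AtomicReference<SnapshotSearcher> current =
        new AtomicReference<>(new SnapshotSearcher(new ArrayList<double[]>()));
    // Touched only by the single rebuilder thread, so it needs no locking.
    private final List<double[]> all = new ArrayList<>();
    private final ExecutorService rebuilder = Executors.newSingleThreadExecutor();

    /** Schedules a rebuild that includes the new samples; searches keep using the old snapshot meanwhile. */
    Future<?> addAll(List<double[]> samples) {
        final List<double[]> batch = new ArrayList<>(samples); // copy before handing off
        return rebuilder.submit(new Runnable() {
            @Override
            public void run() {
                all.addAll(batch);
                current.set(new SnapshotSearcher(all)); // atomic publish of the new searcher
            }
        });
    }

    double[] nearest(double[] query) {
        return current.get().nearest(query);
    }

    int size() {
        return current.get().size();
    }

    void shutdown() {
        rebuilder.shutdown();
    }
}
```

Calling `index.addAll(samples)` returns a `Future`; new samples become visible to `index.nearest(query)` only once that rebuild has finished. Each rebuild copies every vector, which is the cost mentioned above, but readers never observe a half-updated index.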
When new reference vectors are added to a searcher that is concurrently running searches, the ArrayList throws a ConcurrentModificationException (stack trace below). ConcurrentLinkedDeque is a thread-safe alternative to the ArrayList.
java.util.ConcurrentModificationException: null
    at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:819) ~[na:1.7.0_09]
    at java.util.ArrayList$Itr.next(ArrayList.java:791) ~[na:1.7.0_09]
    at org.apache.mahout.knn.search.FastProjectionSearch.reindex(FastProjectionSearch.java:180) ~[knn-0.1-SNAPSHOT.jar:na]
    at org.apache.mahout.knn.search.FastProjectionSearch.search(FastProjectionSearch.java:111) ~[knn-0.1-SNAPSHOT.jar:na]
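
The difference between the two collections can be reproduced in isolation. The sketch below is only an illustration: `IterationSafetyDemo` and its methods are hypothetical names, and it does not touch `FastProjectionSearch`. It appends one element in the middle of an iteration, which is what a writer interleaving with the loop in `reindex` amounts to. `ArrayList` iterators are fail-fast and throw, while `ConcurrentLinkedDeque` iterators are weakly consistent and never throw `ConcurrentModificationException`.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.ConcurrentModificationException;
import java.util.concurrent.ConcurrentLinkedDeque;

final class IterationSafetyDemo {
    /**
     * Iterates over the collection and appends one element mid-iteration,
     * mimicking a writer that interleaves with a reader's loop.
     * Returns true if the iterator failed with ConcurrentModificationException.
     */
    static boolean failsWhenModifiedDuringIteration(Collection<Integer> refs) {
        boolean added = false;
        try {
            for (Integer ref : refs) {
                if (!added) {
                    refs.add(99); // structural modification while iterating
                    added = true;
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    /** Fills the collection with three placeholder values. */
    static Collection<Integer> seed(Collection<Integer> c) {
        c.add(1);
        c.add(2);
        c.add(3);
        return c;
    }

    public static void main(String[] args) {
        System.out.println("ArrayList fails: "
            + failsWhenModifiedDuringIteration(seed(new ArrayList<Integer>())));
        System.out.println("ConcurrentLinkedDeque fails: "
            + failsWhenModifiedDuringIteration(seed(new ConcurrentLinkedDeque<Integer>())));
    }
}
```

Note that weak consistency only removes the exception: a search that is already iterating may or may not see vectors added in the meantime, and any other state the searcher keeps alongside the list would still need its own synchronization.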