Just curious: since fKMeans already has a couple of ways to calculate a
similarity measure between vectors, is it possible to expose this functionality
through an API, a script, or a binary?
I would expect to be able to do something like this:
# --query: the query vector
# --collection: the collection matrix; may be documents, resources, clusters or whatever
# --distance_metric: is 8 the cosine similarity measure?
# --matrixmarket: input and output in matrix market format
$GRAPHLAB_SRC/release/demoapps/glsimilarity \
--query=$query \
--collection=$collection_mtx \
--distance_metric=8 \
--matrixmarket=true \
--ncpus=3
where --query is the query vector and --collection is a matrix market file
describing all available documents/resources.
The expected output would be a vector storing the relationship between the
query and each of the documents defined by the --collection parameter.
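To make the expected output concrete, here is a minimal sketch in plain Python, not GraphLab code. It assumes cosine similarity as the measure and uses in-memory lists instead of matrix market files; the function names `cosine_similarity` and `query_similarities` are made up for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length dense vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Convention chosen for this sketch: a zero vector is similar to nothing.
        return 0.0
    return dot / (norm_a * norm_b)

def query_similarities(query, collection):
    """Return one score per row of the collection, in row order."""
    return [cosine_similarity(query, row) for row in collection]

query = [1.0, 0.0, 1.0]
collection = [
    [1.0, 0.0, 1.0],  # same direction as the query
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [2.0, 0.0, 0.0],  # partial overlap with the query
]
scores = query_similarities(query, collection)
```

Here `scores` plays the role of the output vector: entry i relates the query to row i of the collection.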
An alternative is to expose this through the API so that I can send 2 vectors
for comparison at a time. I can always use a concurrency tool like gearman to
split the calculation across multiple processes to speed it up.
# --input: input that consists of 2 vectors
# --distance_metric: is 8 the cosine similarity measure?
# --matrixmarket: input and output in matrix market format
$GRAPHLAB_SRC/release/demoapps/glsimilarity \
--input=$input_mtx \
--distance_metric=8 \
--matrixmarket=true \
--ncpus=3
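The two-vector variant could be driven from the caller's side roughly as follows. This is only a sketch under assumptions: `cosine_pair` and `compare_all` are hypothetical names, cosine similarity stands in for whichever metric is selected, and a local thread pool stands in for an external job queue such as gearman. Threads only illustrate the dispatch pattern here; real speedup for this kind of work would need separate processes or workers.

```python
import math
from concurrent.futures import ThreadPoolExecutor

def cosine_pair(pair):
    """Compare exactly two vectors; each call is an independent unit of work."""
    a, b = pair
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    # Convention chosen for this sketch: a zero vector yields a score of 0.0.
    return dot / norm if norm else 0.0

def compare_all(query, documents, workers=3):
    """Dispatch one (query, document) pair per task and keep input order."""
    pairs = [(query, doc) for doc in documents]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(cosine_pair, pairs))

results = compare_all([3.0, 4.0], [[3.0, 4.0], [4.0, -3.0], [0.0, 0.0]])
```

Because every pair is independent, the same list of pairs could be handed to any job queue without changing `cosine_pair`.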
Original issue reported on code.google.com by jeffre...@gmail.com on 3 Oct 2011 at 8:45