keerthanashanmugam / graphlabapi

Automatically exported from code.google.com/p/graphlabapi

Calculating Similarity Measure #39

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
Just curious: since fKMeans already has a couple of ways to calculate the 
similarity measure between vectors, is it possible to expose this functionality 
through an API, a script, or a binary?

I would expect to be able to do something like this:
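    # --query: the query vector
    # --collection: the collection matrix, may be documents, resources, clusters or whatever
    # --distance_metric: 8 is the cosine similarity measure?
    # --matrixmarket: input and output in Matrix Market format
    # (comments are kept above the command because a comment line placed
    #  after a trailing backslash would cut the command short)
    $GRAPHLAB_SRC/release/demoapps/glsimilarity \
        --query=$query \
        --collection=$collection_mtx \
        --distance_metric=8 \
        --matrixmarket=true \
        --ncpus=3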

where --query is the query vector and --collection is a Matrix Market file 
describing all available documents/resources.

The expected output would be a vector storing the similarity between the 
query and each document defined by the --collection parameter.
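To make the expected output concrete, here is a minimal sketch of the computation in plain Python, assuming dense vectors and the cosine measure. The function names `cosine_similarity` and `query_similarities` are made up for illustration and are not part of GraphLab; returning 0.0 for an all-zero vector is also just a convention picked for this sketch.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length dense vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption: treat an all-zero vector as similar to nothing.
        return 0.0
    return dot / (norm_a * norm_b)


def query_similarities(query, collection):
    """Return one similarity score per row of the collection matrix."""
    return [cosine_similarity(query, row) for row in collection]


# Hypothetical data: one query and three documents.
query = [1.0, 0.0, 1.0]
collection = [
    [1.0, 0.0, 1.0],  # same direction as the query
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [2.0, 0.0, 0.0],  # overlaps the query in one dimension
]

scores = query_similarities(query, collection)
print(scores)
```

The result has one entry per document, in the same order as the rows of the collection, which is the vector described above.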

An alternative is to expose this through the API so that I can send 2 vectors 
for comparison at a time. I can always use a concurrency tool like Gearman to 
split the calculation across multiple processes to speed it up.

    # --input: input that consists of 2 vectors
    # --distance_metric: 8 is the cosine similarity measure?
    # --matrixmarket: input and output in Matrix Market format
    $GRAPHLAB_SRC/release/demoapps/glsimilarity \
        --input=$input_mtx \
        --distance_metric=8 \
        --matrixmarket=true \
        --ncpus=3
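A rough sketch of how the pairwise variant could be split into independent jobs, again in plain Python. Everything here is an assumption for illustration: `split_into_jobs` and `run_job` are hypothetical helpers, and the jobs are run in a simple loop where a real setup would hand each chunk to a separate worker process (via Gearman or something similar).

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length dense vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption: treat an all-zero vector as similar to nothing.
        return 0.0
    return dot / (norm_a * norm_b)


def split_into_jobs(pairs, n_jobs):
    """Split a list of (a, b) pairs into at most n_jobs contiguous chunks."""
    if n_jobs < 1:
        raise ValueError("n_jobs must be at least 1")
    size = max(1, math.ceil(len(pairs) / n_jobs))
    return [pairs[i:i + size] for i in range(0, len(pairs), size)]


def run_job(chunk):
    """Work done by one worker: compare each pair in its chunk."""
    return [cosine_similarity(a, b) for a, b in chunk]


# Hypothetical data: four vector pairs to compare.
pairs = [
    ([1.0, 0.0], [1.0, 0.0]),
    ([1.0, 0.0], [0.0, 1.0]),
    ([1.0, 1.0], [1.0, 0.0]),
    ([3.0, 4.0], [3.0, 4.0]),
]

jobs = split_into_jobs(pairs, n_jobs=3)

# Stand-in for the workers: each chunk is independent of the others,
# so the chunks could be processed in any order or in parallel.
results = [score for job in jobs for score in run_job(job)]
print(results)
```

Because the chunks are contiguous and concatenated in order, the results line up with the original list of pairs regardless of how many workers are used.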

Original issue reported on code.google.com by jeffre...@gmail.com on 3 Oct 2011 at 8:45

GoogleCodeExporter commented 8 years ago
Erm... how do I change the type of this issue to Enhancement?!

Original comment by jeffre...@gmail.com on 3 Oct 2011 at 8:46