Effective similarity matrix computation

It would be nice to have a method that, given two lists of terms of lengths N and M, returns an N×M similarity matrix for those terms. First, this representation is generic and has many use cases. Second, batch similarity computation can be optimized to beat the naive pairwise approach and to minimize redundant reads. In the case of VSMs, it can even be implemented as a single matrix operation instead of N·M pairwise vector similarities. The MTJ library used in dkpro-similarity is backed by BLAS, so it should be possible to perform basic linear algebra operations really fast.
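
To make the VSM case concrete, here is a minimal sketch, assuming cosine similarity and dense term vectors already loaded into arrays. The class and method names (`SimilarityMatrixSketch`, `normalizeRows`, `cosineMatrix`) are hypothetical and not part of dkpro-similarity. The idea is to normalize every vector once and then compute S = Â·B̂ᵀ, so that each entry of S is the cosine of one term pair. The product is written as plain loops here to keep the example self-contained; with a BLAS-backed library that step would be a single matrix-matrix multiplication.

```java
final class SimilarityMatrixSketch {

    /** Returns a copy of the rows scaled to unit Euclidean length; all-zero rows stay zero. */
    static double[][] normalizeRows(double[][] rows) {
        double[][] out = new double[rows.length][];
        for (int i = 0; i < rows.length; i++) {
            double sumSq = 0.0;
            for (double v : rows[i]) {
                sumSq += v * v;
            }
            double norm = Math.sqrt(sumSq);
            out[i] = new double[rows[i].length];
            if (norm > 0.0) {
                for (int k = 0; k < rows[i].length; k++) {
                    out[i][k] = rows[i][k] / norm;
                }
            }
        }
        return out;
    }

    /**
     * Computes the N x M cosine similarity matrix for N vectors in a and M vectors in b.
     * Assumption: all vectors share the same dimensionality.
     */
    static double[][] cosineMatrix(double[][] a, double[][] b) {
        // Normalize once per vector instead of once per pair.
        double[][] an = normalizeRows(a);
        double[][] bn = normalizeRows(b);
        double[][] s = new double[a.length][b.length];
        for (int i = 0; i < an.length; i++) {
            for (int j = 0; j < bn.length; j++) {
                if (an[i].length != bn[j].length) {
                    throw new IllegalArgumentException("Vector dimensions differ");
                }
                // Dot product of unit vectors equals their cosine similarity.
                double dot = 0.0;
                for (int k = 0; k < an[i].length; k++) {
                    dot += an[i][k] * bn[j][k];
                }
                s[i][j] = dot;
            }
        }
        return s;
    }
}
```

Calling `cosineMatrix(a, b)` with `a` holding N vectors and `b` holding M vectors yields an array with N rows and M columns. Each term vector is read and normalized exactly once, which is where the savings over independent pairwise calls would come from.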

dkpro / dkpro-similarity

Effective similarity matrix computation #47