taki0112 / Vector_Similarity

Python and Java implementations of TS-SS, from the paper "A Hybrid Geometric Approach for Measuring Similarity Level Among Documents and Document Clustering"
MIT License
294 stars, 44 forks

I tried the Python version of TS-SS similarity but found it cannot compute similarities in bulk for matrices v1 (m x n) and v2 (m x k). #5

Open aadityachapagain opened 4 years ago

aadityachapagain commented 4 years ago

It needs vectorization to support similarity measurement for large matrices in bulk. The current approach is quite slow.

aadityachapagain commented 4 years ago

Hello, I just created a pull request containing code that uses NumPy for vectorized similarity calculation. There was already code for vectorized calculation using torch, but the main point here is that we don't need torch for such a small operation, which NumPy can handle easily and efficiently. If we need the similarity matrix as a tensor, we can easily convert it into either a torch tensor or a TensorFlow tensor.
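To illustrate the idea, here is a minimal NumPy sketch of a bulk TS-SS computation. It is not the code from the pull request. The function name `ts_ss_matrix` and the `offset_deg` parameter are hypothetical, and the formulas are my reading of the paper (triangle area times sector area, with a 10 degree offset added to the angle), so please check them against the repository before relying on it. It assumes the vectors are stored as columns, so v1 is (m x n), v2 is (m x k), and the result is (n x k), and it assumes no column is a zero vector.

```python
import numpy as np


def ts_ss_matrix(v1, v2, offset_deg=10.0):
    """Pairwise TS-SS between the columns of v1 (m x n) and v2 (m x k).

    Hypothetical helper: returns an (n x k) array. Assumes non-zero columns.
    """
    v1 = np.asarray(v1, dtype=float)
    v2 = np.asarray(v2, dtype=float)

    # Column norms, shaped so that they broadcast to (n, k).
    n1 = np.linalg.norm(v1, axis=0)[:, None]  # (n, 1)
    n2 = np.linalg.norm(v2, axis=0)[None, :]  # (1, k)

    # All pairwise inner products at once.
    dot = v1.T @ v2  # (n, k)

    # Angle between each pair plus the fixed offset, kept in radians.
    cos = np.clip(dot / (n1 * n2), -1.0, 1.0)
    theta = np.arccos(cos) + np.radians(offset_deg)

    # Triangle similarity: |a| * |b| * sin(theta) / 2.
    triangle = n1 * n2 * np.sin(theta) / 2.0

    # Euclidean distance from norms and inner products, and magnitude difference.
    ed = np.sqrt(np.maximum(n1**2 + n2**2 - 2.0 * dot, 0.0))
    md = np.abs(n1 - n2)

    # Sector similarity: pi * (ED + MD)^2 * theta / 360, with theta in degrees.
    sector = np.pi * (ed + md) ** 2 * np.degrees(theta) / 360.0

    return triangle * sector
```

Everything is computed with one matrix product and broadcasting, so there is no Python loop over pairs. If the vectors are stored as rows instead, pass `v1.T` and `v2.T`.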

I hope you understand. Thanks