hhaensel closed this 1 year ago
Will be very useful, please submit a PR, preferably with some tests and docs.
@aviks Where's the best place to put it: utils.jl, tf_idf.jl, or a new file similarity.jl?
tf_idf.jl would be best, I think.
I needed to calculate cosine similarity. My first attempt was a bare-bones implementation based on a Wikipedia article, but it turned out not to be as fast as desired (approx. 60 s). I finally found a way to improve the speed by three orders of magnitude by using a matrix algorithm. If I did my maths correctly, the following function does the job:
In case others find it useful, I'd be happy to submit a PR.
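For readers who want a concrete picture of the matrix approach, here is a minimal sketch. It is not necessarily the function posted in this issue. It assumes a document-term matrix with one row per document, and the name `cosine_similarity_sketch` is made up for illustration. The idea is to scale every row to unit length once, so that a single matrix product yields all pairwise cosine similarities instead of looping over document pairs.

```julia
using LinearAlgebra

# Hypothetical helper (not part of any package): rows of `dtm` are documents,
# columns are terms. Returns a dense documents × documents similarity matrix.
function cosine_similarity_sketch(dtm::AbstractMatrix{<:Real})
    # Euclidean norm of each row (document vector).
    norms = vec(sqrt.(sum(abs2, dtm; dims=2)))
    # Guard against empty documents: a zero row gets similarity 0 to everything.
    invnorms = [n > 0 ? 1 / n : 0.0 for n in norms]
    # Scale each row to unit length.
    normalized = Diagonal(invnorms) * dtm
    # One matrix product gives the dot product of every pair of unit rows,
    # which is exactly the cosine similarity.
    return Matrix(normalized * normalized')
end

# Small usage example: rows 1 and 2 are parallel, row 3 is orthogonal to both.
dtm = [1.0 0.0 2.0;
       2.0 0.0 4.0;
       0.0 3.0 0.0]
S = cosine_similarity_sketch(dtm)
```

Because the work is done by one matrix multiplication, it is handed off to optimized linear algebra routines, which is presumably where a speed-up of the kind described above would come from. The result is dense, so for a very large number of documents memory use may become the limiting factor.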