Closed: Ayushk4 closed this 4 years ago
I have ported BM25 and the Co-Occurrence Matrix from StringAnalysis.jl. The Co-Occurrence Matrix is 10-15x faster than the one in #164, uses less memory, and supports operations over Document and Corpus types.
LSA has been fixed. ROUGE-N has been re-implemented; it now supports languages and shows a 15-20% improvement in speed and memory usage.
Tests, docstrings, and online documentation have been added for all of these.
@aviks, please review.
I've fixed the merge conflicts and added explicit license attribution to zgornel in coom.jl.
I am porting various implementations from StringAnalysis.jl and fixing various others.
[X] Co-Occurrence Matrix
[X] BM25
[X] Speeding up Rouge.jl
[X] Docstrings and Docs for Evaluation Metrics (Rouge)
[X] Fixing lsa
[X] Docs and tests for lsa
As per the discussion in #164, I prefer to port COOM from StringAnalysis.jl because of the advantages discussed there.
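For anyone skimming this thread, a co-occurrence matrix counts how often two terms appear within a fixed window of each other. The Python sketch below is only a language-agnostic illustration of that idea; it is not the StringAnalysis.jl or TextAnalysis.jl implementation, and the function name `cooccurrence_counts` and its `window` parameter are hypothetical.

```python
from collections import defaultdict

def cooccurrence_counts(tokens, window=2):
    """Count symmetric co-occurrences of tokens at most `window` positions apart."""
    counts = defaultdict(int)
    for i, center in enumerate(tokens):
        # Look only to the right and record both directions,
        # so every pair of positions is visited exactly once.
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            other = tokens[j]
            if other == center:
                continue  # ignore a term co-occurring with itself
            counts[(center, other)] += 1
            counts[(other, center)] += 1
    return dict(counts)

counts = cooccurrence_counts("a b c a".split(), window=2)
```

A sparse dictionary keyed by term pairs is used here for brevity; a real implementation would more likely map terms to integer indices and fill a sparse matrix.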
There seem to be performance bottlenecks in rouge.jl due to abstractly typed containers; this also needs to be worked on.
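For context on what rouge.jl computes, ROUGE-N is a recall-oriented n-gram overlap between a candidate summary and a reference. The Python sketch below is a minimal single-reference illustration under that definition, not the rouge.jl code; the names `ngrams` and `rouge_n_recall` are hypothetical, and whitespace tokenization is an assumption made for brevity.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return a multiset of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n_recall(candidate, reference, n=1):
    """Fraction of reference n-grams that also appear in the candidate (clipped counts)."""
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0  # reference has no n-grams of this size
    # Clip each match at the number of times the n-gram occurs in the candidate.
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total

score = rouge_n_recall("the cat sat on the mat", "the cat is on the mat", n=2)
```

Storing the n-grams in a concretely typed counter rather than a generic container is the same design concern raised above for the Julia code, although the performance effect is specific to Julia's type system.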