lapplislazuli / Hopinosis

Opinosis Implementation in Haskell
MIT License

Sentence-Similarity #14

lapplislazuli closed this issue 4 years ago

lapplislazuli commented 4 years ago

To pick the most redundant but still distinct sentences, I need to compare every candidate sentence with every sentence that has already been chosen.

Proposed Solution: There should be at least one function that computes the distance between two paths/sentences.

Then there should be a function that weights a candidate's metric score by its distance to the already chosen sentences.
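A minimal sketch of what such a weighting could look like, assuming every candidate already carries a metric score and that the similarity function returns a value between 0 and 1. The names `Sentence`, `adjusted` and `pick` are hypothetical and not taken from the Hopinosis code base, and the penalty `score * (1 - highest similarity)` is only one of several possible choices:

```haskell
import Data.List (maximumBy)
import Data.Ord (comparing)

-- Assumption: a sentence is represented as a list of word tokens.
type Sentence = [String]

-- | Scale a candidate's score down by its highest similarity
--   to any already chosen sentence (no penalty if nothing is chosen yet).
adjusted :: (Sentence -> Sentence -> Double) -> [Sentence] -> (Sentence, Double) -> Double
adjusted sim chosen (s, score) = score * (1 - maximum (0 : map (sim s) chosen))

-- | Greedily pick up to n sentences, re-weighting the remaining
--   candidates after every pick.
pick :: (Sentence -> Sentence -> Double) -> Int -> [(Sentence, Double)] -> [Sentence]
pick sim n = go n []
  where
    go k chosen cands
      | k <= 0 || null cands = reverse chosen
      | otherwise =
          let best = maximumBy (comparing (adjusted sim chosen)) cands
          in go (k - 1) (fst best : chosen) (filter (/= best) cands)
```

The similarity function is passed in as an argument, so the selection logic stays independent of whichever metric is picked in the end.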

Possible Problems: It may be hard to find a clean, functional solution for this.

Related Issues: This is a subtask for #11

Additional Context: There are many ways to compare sentence similarity. One Example Article

lapplislazuli commented 4 years ago

Jaccard similarity would be an easy option, but it seems rather weak. It is defined as:

Size(Intersection of words) / Size(Union of Words)
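As a rough sketch, the formula above could be written like this in Haskell, assuming sentences are already split into word tokens (for example with `words`). The function name `jaccard` and the handling of two empty sentences are assumptions for illustration, not existing Hopinosis code:

```haskell
import qualified Data.Set as Set

-- | Jaccard similarity of two tokenised sentences:
--   size of the word intersection divided by size of the word union.
jaccard :: [String] -> [String] -> Double
jaccard xs ys
  | Set.null unionSet = 1.0 -- assumption: two empty sentences count as identical
  | otherwise =
      fromIntegral (Set.size interSet) / fromIntegral (Set.size unionSet)
  where
    a = Set.fromList xs
    b = Set.fromList ys
    interSet = Set.intersection a b
    unionSet = Set.union a b
```

For example, "the screen is bright" and "the screen is dark" share 3 of 5 distinct words, so `jaccard` gives 0.6. Since sets drop duplicates, repeated words do not count more than once, which is one reason the measure is rather coarse.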

lapplislazuli commented 4 years ago

Cosine similarity (https://medium.com/@sumn2u/cosine-similarity-between-two-sentences-8f6630b0ebb7) would be much better.
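A small sketch of cosine similarity over plain word counts, assuming tokenised sentences and raw term frequencies as vector entries (no TF-IDF weighting or stop-word removal). The names `termFreq` and `cosine` are hypothetical and only meant to illustrate the idea:

```haskell
import qualified Data.Map.Strict as Map

-- | Count how often each word occurs in a sentence.
termFreq :: [String] -> Map.Map String Double
termFreq ws = Map.fromListWith (+) [(w, 1) | w <- ws]

-- | Cosine similarity of two tokenised sentences, using their
--   word-count vectors: dot product divided by the product of the norms.
cosine :: [String] -> [String] -> Double
cosine xs ys
  | normA == 0 || normB == 0 = 0 -- assumption: an empty sentence is similar to nothing
  | otherwise = dot / (normA * normB)
  where
    a = termFreq xs
    b = termFreq ys
    -- Only words present in both sentences contribute to the dot product.
    dot = sum (Map.elems (Map.intersectionWith (*) a b))
    norm m = sqrt (sum [v * v | v <- Map.elems m])
    normA = norm a
    normB = norm b
```

Unlike the Jaccard variant, repeated words increase the weight of that word in the vector, so sentences that repeat the same terms end up closer together.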

Example behind Paywall

Example without Paywall