Closed kalemi19 closed 5 years ago
@kalemi19, the tool you linked to reports a cosine similarity for all inputs. So I'm pretty confident my implementation is the correct one ;-)
Correction: the tool you linked compares words, while string similarity algorithms usually compare characters.
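To illustrate the difference, here is a minimal Ruby sketch. It is not the gem's actual code, and `counts`, `cosine`, `char_cosine`, and `word_cosine` are hypothetical names made up for this example. It also assumes the online tool tokenizes on whitespace, which I have not verified.

```ruby
# Count how often each token occurs (hypothetical helper).
def counts(tokens)
  tokens.each_with_object(Hash.new(0)) { |token, h| h[token] += 1 }
end

# Cosine similarity between two frequency hashes.
def cosine(a, b)
  dot = a.sum { |key, count| count * b.fetch(key, 0) }
  norm_a = Math.sqrt(a.values.sum { |v| v * v })
  norm_b = Math.sqrt(b.values.sum { |v| v * v })
  return 0.0 if norm_a.zero? || norm_b.zero?

  dot / (norm_a * norm_b)
end

# Character-based: every character is a token.
def char_cosine(s1, s2)
  cosine(counts(s1.chars), counts(s2.chars))
end

# Word-based: tokens are whitespace-separated words (assumption).
def word_cosine(s1, s2)
  cosine(counts(s1.split), counts(s2.split))
end
```

A URL contains no whitespace, so under this assumption each URL is a single word. Two different URLs then share no tokens and `word_cosine` returns 0, while `char_cosine` can still be high because both strings draw on largely the same letters.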
Got it. Makes sense now. Thank you.
Not sure if this gem is still maintained, but the cosine similarity it returns for the following two URLs is 97%:
https://maduradas.com/pena-ajena-la-ridicula-actuacion-estos-gaiteros-chavistas-programa-diosdado-video/
https://maduradas.com/sepalo-ortega-diaz-afirmo-globovision-la-vitalicia-deberan-subastadas-al-restituirse-la-democracia/
Just by looking at the URLs you can tell they're far from being the same.
According to this online tool, https://asecuritysite.com/forensics/simstring, the cosine similarity should be 0.
Meanwhile, I'm looking at the underlying algorithm.