ad-si / Textalyzer

Analyze key metrics like number of words, readability, complexity, etc. of any kind of text

Detect text duplication #7

Open ad-si opened 7 months ago

ad-si commented 7 months ago
  1. Collapse every run of whitespace on each line into a single space
  2. Create a hash of each normalized line
  3. Find the longest matching sequence of line hashes (allowing up to x hashes to differ)
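
The three steps above could be sketched roughly as follows. This is only a minimal illustration in Python, not the project's implementation: the function names `line_hashes` and `longest_match`, the `max_diff` parameter (standing in for "x"), and the choice of SHA-1 are all assumptions made for the example. It compares two texts line by line along every alignment and keeps the longest window containing at most `max_diff` differing hashes.

```python
import hashlib
import re


def line_hashes(text: str) -> list[str]:
    """Steps 1 and 2: collapse whitespace runs on each line, then hash the line."""
    hashes = []
    for line in text.splitlines():
        normalized = re.sub(r"\s+", " ", line)  # any whitespace run becomes one space
        # SHA-1 is an arbitrary choice here; any stable hash would do.
        hashes.append(hashlib.sha1(normalized.encode("utf-8")).hexdigest())
    return hashes


def longest_match(a: list[str], b: list[str], max_diff: int = 0) -> tuple[int, int, int]:
    """Step 3: longest aligned run of a and b with at most max_diff differing hashes.

    Returns (start_in_a, start_in_b, length). Brute force over all alignments,
    so the cost is roughly len(a) * len(b) comparisons.
    """
    best = (0, 0, 0)
    for offset in range(-(len(b) - 1), len(a)):
        # On this alignment, a[i + k] is compared with b[j + k].
        i = max(offset, 0)
        j = i - offset
        n = min(len(a) - i, len(b) - j)
        left = 0
        diffs = 0
        for right in range(n):
            if a[i + right] != b[j + right]:
                diffs += 1
            # Shrink the window from the left until it is within the budget again.
            while diffs > max_diff:
                if a[i + left] != b[j + left]:
                    diffs -= 1
                left += 1
            length = right - left + 1
            if length > best[2]:
                best = (i + left, j + left, length)
    return best


text_a = "intro\nfoo   bar\nbaz\tqux\nend a\n"
text_b = "foo bar\nbaz qux\nother\n"
exact = longest_match(line_hashes(text_a), line_hashes(text_b))
fuzzy = longest_match(line_hashes(text_a), line_hashes(text_b), max_diff=1)
print(exact, fuzzy)
```

Note that with `max_diff > 0` a reported window may begin or end on a differing line, so a real implementation would probably trim those edges and ignore very short matches. Detecting duplication inside a single file would additionally need to skip the trivial alignment of the text with itself.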