danielvarga / hunglish-webapp

Automatically exported from code.google.com/p/hunglish-webapp

improve quality filter: same text on both sides #69

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Example: search for "risque":
http://hunglish.hu/search?huSentence=&enSentence=risque&doc.genre=-10

TODO:
If the sentence is more or less the same on both sides (Hungarian and English), filter the pair out.
Hint: reuse the hash function from the duplicate filter to implement this.
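
A minimal sketch of the idea, in Python for brevity. The duplicate filter's actual hash function is not shown in this issue, so `normalize` and `sentence_hash` below are hypothetical stand-ins (lowercasing, dropping punctuation, then MD5); the real filter may normalize differently. The point is only that a pair is dropped when both sides hash to the same value:

```python
import hashlib
import re
import unicodedata


def normalize(sentence: str) -> str:
    # Assumed normalization, not the project's own: apply Unicode NFKC,
    # lowercase, and keep only runs of word characters joined by single
    # spaces, so punctuation and spacing differences are ignored.
    text = unicodedata.normalize("NFKC", sentence).lower()
    return " ".join(re.findall(r"\w+", text))


def sentence_hash(sentence: str) -> str:
    # Stand-in for the hash used by the duplicate filter (assumption).
    return hashlib.md5(normalize(sentence).encode("utf-8")).hexdigest()


def same_on_both_sides(hu_sentence: str, en_sentence: str) -> bool:
    # True when the two sides are identical after normalization.
    return sentence_hash(hu_sentence) == sentence_hash(en_sentence)


def filter_pairs(pairs):
    # Keep only the pairs whose two sides differ after normalization.
    return [(hu, en) for hu, en in pairs if not same_on_both_sides(hu, en)]
```

With this sketch, a pair such as ("Risque", "risque.") would be dropped, while a real translation pair is kept. Note that an exact hash match only catches sides that are identical after normalization; near-matches with a differing word would need a fuzzier comparison.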

Original issue reported on code.google.com by bpgergo on 6 Jun 2011 at 1:22