betito / smartfind

Automatically exported from code.google.com/p/smartfind

Text splitting does not work well in some cases #2

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Go to www.google.com (Google in English)
2. Press ctrl+F and search for 'adverstising'

The most similar word returned is 'ToolsAdvertising', which is actually the
two words 'Tools' and 'Advertising' separated by a <br> HTML tag rather than
by a space or punctuation. The text split function should handle such cases.
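
One possible way to handle this, sketched in Python rather than in the extension's own code: replace every HTML tag with a space before splitting, so a tag such as <br> acts as a word boundary. The names TAG_RE, WORD_RE and split_words are hypothetical and are not taken from smartfind.

```python
import re

# Hypothetical helper, not smartfind's actual implementation.
# Any HTML tag (e.g. <br>, <br/>, </a>) is treated as a word separator.
TAG_RE = re.compile(r"<[^>]+>")
# A word is a run of letters or digits (underscore excluded).
WORD_RE = re.compile(r"[^\W_]+")

def split_words(html):
    """Return the words in an HTML fragment, treating tags as separators."""
    text = TAG_RE.sub(" ", html)   # "Tools<br>Advertising" -> "Tools Advertising"
    return WORD_RE.findall(text)

print(split_words("Tools<br>Advertising"))
```

This only addresses tags used as separators. It does not remove the text inside elements such as <script>.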

Original issue reported on code.google.com by tnol...@gmail.com on 26 Jun 2008 at 7:09

GoogleCodeExporter commented 9 years ago
Our split does not filter out the content of some tags (e.g. everything
between <script> and </script>). Our regexp still has to be improved.
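
As an alternative to extending the regexp, a rough sketch of the filtering using Python's standard html.parser module: text inside <script> and <style> is skipped, and every other tag acts as a word boundary. The class VisibleText and the function visible_words are hypothetical names used for illustration only, and the choice of which tags to skip is an assumption.

```python
from html.parser import HTMLParser

class VisibleText(HTMLParser):
    """Collects text nodes, ignoring anything inside <script> or <style>."""
    SKIP = {"script", "style"}  # assumed set of tags whose content is hidden

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if not self._skip_depth:
            self.parts.append(data)

def visible_words(html):
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    # Joining with spaces makes every tag boundary a word boundary.
    return " ".join(parser.parts).split()

print(visible_words("Tools<br>Advertising<script>var x = 1;</script>"))
```

A parser-based approach avoids the usual pitfalls of matching nested or multi-line tags with a single regexp, at the cost of an extra pass over the markup.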

Original comment by tonikitoo@gmail.com on 27 Jun 2008 at 4:35