dkm05midhra / crawler4j

Automatically exported from code.google.com/p/crawler4j

spacing removed after htmlParseData.getText(); #269

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Configure the crawler to crawl the URL http://www.espn.go.com
2. Look at the content returned by htmlParseData.getText();
3. The text comes back with the words run together, for example {MyESPNNFLMLBNBANHLNCAAFNCAAMNASCARWORLD 
CUPGOLFTENNISBOXINGMMAMORE SPORTSINSIDERSNRADIO& MOREespnW& X GAMESFANTASY& 
GAMESWATCH}

What is the expected output? What do you see instead?
Expected: the words separated as they are on the page at that URL. Instead, the getText() output has the spacing between them removed.
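
The report does not say where the spaces are lost, and the following is not crawler4j's code. It is a minimal, self-contained sketch of one plausible cause, assuming the text is built by concatenating the contents of adjacent elements (for example menu items in a list) without inserting a separator. The class name, method names, and sample HTML are all hypothetical, and the regex-based tag stripping is only for illustration, not a real HTML parser.

```java
import java.util.regex.Pattern;

// Hypothetical illustration only: not taken from crawler4j.
public class TextSpacingSketch {
    // Matches any HTML tag; good enough for this toy input, not for real pages.
    private static final Pattern TAG = Pattern.compile("<[^>]*>");
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    // Dropping tags outright glues the text of neighbouring elements together.
    public static String stripTagsNaive(String html) {
        return TAG.matcher(html).replaceAll("");
    }

    // Replacing each tag with a space keeps a boundary between elements;
    // runs of whitespace are then collapsed and the ends trimmed.
    public static String stripTagsKeepingBoundaries(String html) {
        String spaced = TAG.matcher(html).replaceAll(" ");
        return WHITESPACE.matcher(spaced).replaceAll(" ").trim();
    }

    public static void main(String[] args) {
        // Assumed sample markup resembling a navigation menu.
        String html = "<ul><li>NFL</li><li>MLB</li><li>NBA</li></ul>";
        System.out.println(stripTagsNaive(html));             // NFLMLBNBA
        System.out.println(stripTagsKeepingBoundaries(html)); // NFL MLB NBA
    }
}
```

If the cause is of this kind, a caller could work around it by extracting text from htmlParseData.getHtml() with its own boundary-preserving logic instead of relying on getText().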

What version of the product are you using?
latest

Please provide any additional information below.

Original issue reported on code.google.com by a.singh2...@gmail.com on 9 Jul 2014 at 6:47

GoogleCodeExporter commented 9 years ago

Original comment by avrah...@gmail.com on 18 Aug 2014 at 3:51