What steps will reproduce the problem?
1. Call controller.addSeed("http://www.tudou.com/"); Tudou is a Chinese video
site whose HTML pages use the GBK charset.
2. Set "crawler.default_encoding=gbk" in the file crawler4j.properties.
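Since step 2 depends on the JVM recognizing the charset name "gbk", a quick standalone check can rule that out before looking at the crawler itself. This is only a sketch under assumptions: the class name GbkCheck and its helper methods are hypothetical, it uses only java.nio.charset from the JDK and no crawler4j API, and it is not known whether the charset is related to the fetch errors reported here.

```java
import java.nio.charset.Charset;

public class GbkCheck {
    // Returns true if this JVM can resolve the given charset name.
    // Illegal or null names are reported as unsupported instead of throwing.
    static boolean isSupported(String name) {
        try {
            return Charset.isSupported(name);
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    // Encodes the text with the named charset and decodes it again.
    // The result equals the input when every character is representable.
    static String roundTrip(String text, String charsetName) {
        Charset cs = Charset.forName(charsetName);
        return new String(text.getBytes(cs), cs);
    }

    public static void main(String[] args) {
        // "gbk" is the value used in crawler4j.properties in step 2.
        String name = "gbk";
        System.out.println(name + " supported: " + isSupported(name));
        if (isSupported(name)) {
            String sample = "\u571f\u8c46"; // two Chinese characters as sample text
            System.out.println("round trip ok: " + sample.equals(roundTrip(sample, name)));
        }
    }
}
```

If the check reports that "gbk" is supported and the round trip succeeds, the encoding setting is probably not what makes the fetch fail, and the cause would have to be looked for elsewhere.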
What is the expected output? What do you see instead?
Expected output:
The crawler runs well.
What I got:
Error messages:
ERROR [main] Error while fetching http://www.tudou.com/robots.txt
ERROR [Crawler 1] Error while fetching http://www.tudou.com/
What version of the product are you using? On what operating system?
Version 2.6.1
OS: Linux X 2.6.38-8-server #42-Ubuntu SMP Mon Apr 11 03:49:04 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux
Please provide any additional information below.
How can I solve this problem?
Original issue reported on code.google.com by wenlei.z...@gmail.com on 21 Nov 2011 at 6:02