zhegexiaohuozi / SeimiCrawler

A simple, agile, distributed Java crawler framework with Spring Boot support.
http://seimicrawler.org
Apache License 2.0
1.98k stars, 679 forks

Error when using a proxy #55

Open liuyu-struggle opened 4 years ago

liuyu-struggle commented 4 years ago

While using a proxy I ran into the same error as the one reported in issue #37:

java.lang.ClassCastException: org.apache.http.message.BasicHttpRequest cannot be cast to org.apache.http.client.methods.HttpUriRequest
    at cn.wanghaomiao.seimi.http.hc.HcDownloader.getRealUrl(HcDownloader.java:180) ~[SeimiCrawler-2.0.jar:na]
    at cn.wanghaomiao.seimi.http.hc.HcDownloader.renderResponse(HcDownloader.java:117) ~[SeimiCrawler-2.0.jar:na]
    at cn.wanghaomiao.seimi.http.hc.HcDownloader.process(HcDownloader.java:79) ~[SeimiCrawler-2.0.jar:na]
    at cn.wanghaomiao.seimi.core.SeimiProcessor.run(SeimiProcessor.java:101) ~[SeimiCrawler-2.0.jar:na]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_181]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_181]
    at java.lang.Thread.run(Thread.java:748) [na:1.8.0_181]
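
The trace above shows an unconditional cast failing inside getRealUrl. A general way to avoid this class of exception is to check the runtime type before casting and fall back to a known value otherwise. The sketch below only illustrates that pattern: Request, UriRequest, SafeCast and realUrl are hypothetical stand-ins, not the Apache HttpClient or SeimiCrawler classes, and it is not the project's actual fix.

```java
// Stand-in types for illustration only; these are NOT the Apache HttpClient
// or SeimiCrawler classes named in the stack trace.
interface Request {
    String requestLine();
}

interface UriRequest extends Request {
    String uri();
}

final class SafeCast {
    // Returns the request's URI when the object really is a UriRequest,
    // otherwise returns the caller-supplied fallback instead of throwing.
    static String realUrl(Request req, String fallbackUrl) {
        if (req instanceof UriRequest) {
            return ((UriRequest) req).uri();
        }
        return fallbackUrl;
    }
}
```

With a guard like this, a request object of an unexpected type yields the fallback URL (for example the originally requested one) rather than a ClassCastException.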

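Issue #37 suggests switching from the default HttpClient to OkHttp to avoid this, and I found the instructions for doing so in the official documentation (the section on using OkHttp).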

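After switching to OkHttp the error above did go away, but a new problem appeared: when requests go through the proxy, the content of the returned response is garbled. I don't know how to fix this yet. Any help would be appreciated!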
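
Garbled content of this kind is often caused either by decoding the body with the wrong charset or by a body that is still gzip-compressed, although nothing in this report confirms which applies here. The following is a minimal diagnostic sketch using only the Java standard library, assuming the raw response bytes can be obtained somehow. CharsetCheck, looksGzipped and decode are hypothetical helper names and are not part of SeimiCrawler or OkHttp.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical diagnostic helper; not part of SeimiCrawler or OkHttp.
final class CharsetCheck {
    // True when the body starts with the gzip magic bytes 0x1F 0x8B,
    // which would mean it was never decompressed.
    static boolean looksGzipped(byte[] body) {
        return body.length >= 2
                && (body[0] & 0xFF) == 0x1F
                && (body[1] & 0xFF) == 0x8B;
    }

    // Decodes the raw bytes with an explicitly chosen charset.
    static String decode(byte[] body, String charsetName) {
        return new String(body, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // Two Chinese characters written as escapes so the source file
        // encoding does not matter.
        byte[] body = "\u4EE3\u7406".getBytes(StandardCharsets.UTF_8);

        System.out.println("gzipped: " + looksGzipped(body));
        // Wrong charset: the same bytes come out as unreadable text.
        System.out.println("ISO-8859-1: " + decode(body, "ISO-8859-1"));
        // Matching charset: the original text is recovered.
        System.out.println("UTF-8: " + decode(body, "UTF-8"));
    }
}
```

If decoding the same bytes with a different charset (for Chinese pages, commonly GBK or UTF-8) gives readable text, the problem is charset selection; if looksGzipped returns true, the body needs to be decompressed before decoding.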