fanyong920 / jvppeteer

Headless Chrome For Java (Java crawler)
Apache License 2.0
705 stars 158 forks source link

Visiting a specific site throws "Response body is unavailable for redirect responses" #18

Closed SmectaWang closed 4 years ago

SmectaWang commented 4 years ago

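First, I am on version 1.0.5. Why the old version? Because the newer versions have a strange problem: after opening browser Pages under high concurrency, everything hangs. Specifically, after Page.goto() is called, the request stays in the pending state, and once this is triggered it never recovers; restarting the browser does not help either. So I have no choice but to stay on the old version.

Now, visiting one particular site triggers this exception 100% of the time, and it cannot be handled. As a result Page.close() has no effect and the page is leaked, left open with no way to close it. The site is: http://www.baoji.gov.cn/col/col261/index.html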

com.ruiyun.jvppeteer.exception.ProtocolException: java.lang.RuntimeException: Response body is unavailable for redirect responses
    at com.ruiyun.jvppeteer.transport.Connection.onMessage(Connection.java:216)
    at com.ruiyun.jvppeteer.transport.Connection.accept(Connection.java:260)
    at com.ruiyun.jvppeteer.transport.Connection.accept(Connection.java:28)
    at com.ruiyun.jvppeteer.transport.WebSocketTransport.onMessage(WebSocketTransport.java:47)
    at org.java_websocket.client.WebSocketClient.onWebsocketMessage(WebSocketClient.java:591)
    at org.java_websocket.drafts.Draft_6455.processFrameText(Draft_6455.java:885)
    at org.java_websocket.drafts.Draft_6455.processFrame(Draft_6455.java:819)
    at org.java_websocket.WebSocketImpl.decodeFrames(WebSocketImpl.java:379)
    at org.java_websocket.WebSocketImpl.decode(WebSocketImpl.java:216)
    at org.java_websocket.client.WebSocketClient.run(WebSocketClient.java:508)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: Response body is unavailable for redirect responses
    at com.ruiyun.jvppeteer.core.page.NetworkManager.handleRequestRedirect(NetworkManager.java:324)
    at com.ruiyun.jvppeteer.core.page.NetworkManager.onRequest(NetworkManager.java:310)
    at com.ruiyun.jvppeteer.core.page.NetworkManager.onRequestPaused(NetworkManager.java:297)
    at com.ruiyun.jvppeteer.core.page.NetworkManager$1.onBrowserEvent(NetworkManager.java:79)
    at com.ruiyun.jvppeteer.core.page.NetworkManager$1.onBrowserEvent(NetworkManager.java:75)
    at com.ruiyun.jvppeteer.events.EventEmitter.invokeListener(EventEmitter.java:131)
    at com.ruiyun.jvppeteer.events.EventEmitter.emit(EventEmitter.java:106)
    at com.ruiyun.jvppeteer.transport.CDPSession.onMessage(CDPSession.java:193)
    at com.ruiyun.jvppeteer.transport.Connection.onMessage(Connection.java:182)
    ... 10 common frames omitted

fanyong920 commented 4 years ago

1.0.8 may hang; try 1.0.9.


SmectaWang commented 4 years ago

Bro, I just tried it, and 1.0.9 still hangs... all requests stay (pending). In my tests, only 1.0.5 can run for a long time, but it has the error I posted: visiting www.baoji.gov.cn/col/col261/index.html throws "Response body is unavailable for redirect responses".


fanyong920 commented 4 years ago

When you say it stays pending, is it the page in the browser that stays pending, or is the code stuck as well, with the thread hung?


SmectaWang commented 4 years ago

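It is the browser requests that are pending. I opened DevTools: the request for the URL just stops moving. The program keeps running, and in the end everything is a Timeout. Launching a new browser does not fix it. But if I manually create a new Page in the browser and type in a URL, it opens fine. It feels like the problem is somewhere around Page.

I took a screenshot.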


fanyong920 commented 4 years ago

Could you show me the project code? Otherwise I have no way of locating the problem.


fanyong920 commented 4 years ago

Clean out the files in the Windows temp folder.

SmectaWang commented 4 years ago

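"Clean out the files in the Windows temp folder."

Bro, I'm back... After clearing the temp files, version 1.0.9 ran for half a day and then died anyway: every page went (Pending). I also noticed something: once this batch of pages is stuck in (Pending), killing the Java process makes them resume loading.

I tidied up the code. It is roughly a Service like this. I have switched back to version 1.0.5 for now, which is fairly stable.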

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.concurrent.FutureTask;
import java.util.concurrent.TimeUnit;
// The jvppeteer imports, StringUtils, RenderException, GetPageTask and
// GetPageContentTask are not shown in the original post.

public class RenderService2 {

    static LaunchOptions options;
    static ArrayList<String> argList = new ArrayList<>();

    static {
        options = new LaunchOptionsBuilder()
                .withDevtools(true)
                .withHeadless(false)
                .withArgs(argList)
                .build();
    }

    // Single shared browser, created lazily with double-checked locking.
    static volatile Browser browser = null;

    public String renderPage(String address) throws RenderException {
        if (StringUtils.isEmpty(address)) {
            return "It is NULL that the URL was parsed !";
        }

        Page page = null;

        try {
            if (browser == null) {
                synchronized (Browser.class) {
                    if (browser == null) {
                        browser = Puppeteer.launch(options);
                        System.out.println("Launch a Browser successfully!");
                    }
                }
            }

            // Note: this task is started but its result is never read;
            // the page actually used below comes from browser.newPage().
            FutureTask<Page> future = new FutureTask<Page>(new GetPageTask(browser));
            new Thread(future).start();
            page = browser.newPage();

            page.setDefaultNavigationTimeout(25000);
            page.setRequestInterception(true);
            HashSet<String> set = new HashSet<>();
            page.onRequest(request -> {
                String type = request.resourceType();
                String url = request.url().toLowerCase();
                if (set.contains(url)) {
                    // Let repeated requests through to avoid an endless loop
                    request.continueRequest();
                    return;
                }

                if ("media".equals(type) || "stylesheet".equals(type)
                        || "font".equals(type)) {
                    // Requests that should be rejected
                    set.add(url);
                    request.abort();
                    return;
                }
                if ("image".equals(type) || url.contains(".png") || url.contains(".gif")
                        || url.contains(".jpeg") || url.contains(".jpg") || url.contains(".ico")) {
                    set.add(url);
                    request.abort();
                    return;
                }
                request.continueRequest();
            });

            // Try to avoid webdriver detection
            //page.evaluateOnNewDocument("() =>{ Object.defineProperties(navigator,{ webdriver:{ get: () => undefined } }) }");
            page.evaluateOnNewDocument("() =>{ window.navigator.chrome = { runtime: {},  }; }", PageEvaluateType.FUNCTION, "");
            page.evaluateOnNewDocument("() =>{ Object.defineProperty(navigator, 'languages', { get: () => ['en-US', 'en'] }); }", PageEvaluateType.FUNCTION, "");
            page.evaluateOnNewDocument("() =>{ Object.defineProperty(navigator, 'plugins', { get: () => [1, 2, 3, 4, 5,6], }); }", PageEvaluateType.FUNCTION, "");
            page.evaluateOnNewDocument("() =>{ const newProto = navigator.__proto__;delete newProto.webdriver;navigator.__proto__ = newProto;}", PageEvaluateType.FUNCTION, "");

            System.out.println("Open a page goto:".concat(address));
            page.goTo(address);

            // Fetch the page content on another thread, waiting at most 5 seconds.
            FutureTask<String> futureTask = new FutureTask<String>(new GetPageContentTask(page));
            new Thread(futureTask).start();
            String c = futureTask.get(5, TimeUnit.SECONDS);

            if (StringUtils.isEmpty(c)) {
                throw new Exception();
            }

            return c;

        } catch (Throwable e) {
            e.printStackTrace();
            return e.getMessage();

        } finally {
            // Always try to close the page, even when rendering failed.
            try {
                if (page != null) {
                    page.close();
                }
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    }

    public static void main(String[] args) {

        RenderService2 r = new RenderService2();
        try {
            System.out.println(r.renderPage("http://www.baoji.gov.cn/col/col263/index.html"));

        } catch (Throwable e) {
            e.printStackTrace();
        }
    }
}
```

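
The request handler in the service above decides, per request, whether to abort or continue based on the resource type and the URL. As a rough sketch only, that decision can be pulled out into a small class that does not touch jvppeteer at all, which makes it easy to check in isolation. The names `ResourceFilter` and `shouldAbort` are hypothetical and not part of jvppeteer or the original post; the rules mirror the ones in the handler, and the only deliberate change is a concurrent set in place of `HashSet`, on the assumption that the callback may run on a different thread from the one that created the set.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper: holds only the abort/continue decision, no jvppeteer calls.
class ResourceFilter {
    // Resource types the handler above rejects.
    private static final Set<String> BLOCKED_TYPES =
            new HashSet<>(Arrays.asList("media", "stylesheet", "font", "image"));
    // URL fragments the handler above treats as images.
    private static final String[] IMAGE_MARKERS = {".png", ".gif", ".jpeg", ".jpg", ".ico"};

    // URLs that were already rejected once; a concurrent set is an assumption,
    // the original code used a plain HashSet.
    private final Set<String> alreadyBlocked = ConcurrentHashMap.newKeySet();

    // Returns true if the request should be aborted, false if it should continue.
    boolean shouldAbort(String resourceType, String url) {
        String key = url.toLowerCase(Locale.ROOT);
        if (alreadyBlocked.contains(key)) {
            // Same rule as the original: a URL rejected before is let through on repeat.
            return false;
        }
        boolean blocked = BLOCKED_TYPES.contains(resourceType) || looksLikeImage(key);
        if (blocked) {
            alreadyBlocked.add(key);
        }
        return blocked;
    }

    private static boolean looksLikeImage(String lowerCaseUrl) {
        for (String marker : IMAGE_MARKERS) {
            if (lowerCaseUrl.contains(marker)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        ResourceFilter filter = new ResourceFilter();
        System.out.println(filter.shouldAbort("stylesheet", "http://example.com/a.css"));
        System.out.println(filter.shouldAbort("stylesheet", "http://example.com/a.css"));
        System.out.println(filter.shouldAbort("document", "http://example.com/index.html"));
    }
}
```

Inside the handler, the result would still have to be turned into exactly one `request.abort()` or `request.continueRequest()` call per request, as the original code does.
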
fanyong920 commented 4 years ago

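I committed code last night. If you don't mind the hassle, download my source code, build it into a jar instead of pulling it in through Maven, and try again.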


SmectaWang commented 4 years ago

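I'm back. I tested the current code over the past two days and there are no problems. I'll wait for your release, thanks. This problem no longer occurs, so I'm closing the issue~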
