WebCollector is an open source web crawler framework based on Java. It provides simple interfaces for crawling the Web; you can set up a multi-threaded web crawler in less than 5 minutes.
Exception when updating db, java.lang.InterruptedException,org.openqa.selenium.remote.UnreachableBrowserException: Error communicating with the remote browser. It may have died. #83
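I ran into this problem while crawling. I found reports related to org.openqa.selenium.remote.UnreachableBrowserException: Error communicating with the remote browser. It may have died., but none of them offered a good solution. They say it is related to TCP connections, but I checked the TCP connections on the server and they do not seem to be the cause. If you have any insight, I would appreciate your guidance. Thanks!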
org.openqa.selenium.remote.UnreachableBrowserException: Error communicating with the remote browser. It may have died.
Build info: version: 'unknown', revision: 'unknown', time: 'unknown'
System info: host: 'hadoop3.test.yunwei.puppet.dh', ip: '192.168.112.49', os.name: 'Linux', os.arch: 'amd64', os.version: '2.6.32-696.1.1.el6.x86_64', java.version: '1.8.0_141'
Driver info: driver.version: RemoteWebDriver
at org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:589)
at org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:610)
at org.openqa.selenium.remote.RemoteWebDriver.quit(RemoteWebDriver.java:464)
at cn.dianhun.qimai.QiMaiCrawler.visit(QiMaiCrawler.java:253)
at cn.edu.hfut.dmic.webcollector.crawler.AutoParseCrawler.execute(AutoParseCrawler.java:78)
at cn.edu.hfut.dmic.webcollector.fetcher.Fetcher$FetcherThread.run(Fetcher.java:242)
Caused by: java.lang.RuntimeException: java.lang.InterruptedException
at com.google.common.base.Throwables.propagate(Throwables.java:160)
at org.openqa.selenium.net.UrlChecker.waitUntilUnavailable(UrlChecker.java:145)
at org.openqa.selenium.remote.service.DriverService.stop(DriverService.java:184)
at org.openqa.selenium.phantomjs.PhantomJSCommandExecutor.execute(PhantomJSCommandExecutor.java:94)
at org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:568)
... 5 more
Caused by: java.lang.InterruptedException
at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404)
at java.util.concurrent.FutureTask.get(FutureTask.java:204)
at com.google.common.util.concurrent.SimpleTimeLimiter.callWithTimeout(SimpleTimeLimiter.java:130)
at org.openqa.selenium.net.UrlChecker.waitUntilUnavailable(UrlChecker.java:117)
... 8 more
INFO [Thread-7] - Exception when updating db
java.lang.IllegalStateException: Can't call Database.put Database was closed.
at com.sleepycat.je.Database.checkOpen(Database.java:1863)
at com.sleepycat.je.Database.put(Database.java:1168)
at cn.edu.hfut.dmic.webcollector.plugin.berkeley.BerkeleyDBUtils.put(BerkeleyDBUtils.java:52)
at cn.edu.hfut.dmic.webcollector.plugin.berkeley.BerkeleyDBUtils.writeDatum(BerkeleyDBUtils.java:48)
at cn.edu.hfut.dmic.webcollector.plugin.berkeley.BerkeleyDBManager.writeFetchSegment(BerkeleyDBManager.java:146)
at cn.edu.hfut.dmic.webcollector.fetcher.Fetcher$FetcherThread.run(Fetcher.java:263)